
Why AI fails: the unsexy reality of business data

By Matthew Gibbons, NEMEA and APAC VP at Semarchy

Imagine investing millions in AI, only to watch it fall short, not because the algorithms are flawed or the talent lacking, but because the data going in was poor to begin with. 

Semarchy’s recent survey found that fewer than 1 in 50 enterprises genuinely escape the pitfalls of bad data, even as 74% plan to increase their AI investments in 2025. Despite the headlines and hype, the foundational element of data is all too often overlooked in AI developments. 

As businesses race to deploy the latest AI models, they often bypass the essential groundwork of data governance and quality management. The result is operational chaos, loss of trust, and wasted time and resources. The reality is that most AI projects that fail do so because of neglected data basics: setbacks that are entirely avoidable with the right approach. 

Rubbish in, rubbish out 

Much has changed in the world of business technology, but one rule remains ironclad: rubbish in, rubbish out. No matter how advanced the AI software, poor data at the start means disappointing, and sometimes damaging, outcomes at the end. Poor data may be incomplete, siloed, outdated or poorly formatted. When models are trained on such data sets, the outputs mirror those same problems, producing inconsistent or biased results. These errors can drive up operational costs and even create regulatory exposure.

Rather than blaming AI itself, it’s time to evaluate the foundations beneath it. Building AI on bad data is like constructing a house on unstable ground; the foundation fails, and the entire project is compromised. 

Yet leadership priorities around AI remain misaligned. Semarchy’s research found that only 13% of Chief Data Officers (CDOs) prioritise AI, compared to 41% of Chief Technology Officers (CTOs). This suggests that AI is still being driven more by product innovation than strategic data leadership, highlighting a need for better alignment across executive roles. 

Why data quality gets overlooked 

In the rush to ‘do AI’, many organisations neglect the first step: securing a reliable data foundation. Data quality is often perceived as unrewarding, time-consuming, or something to put off until pilot projects are underway.

But that deferral leads to expensive issues later. Consider one high-profile example from 2024: McDonald’s US ended its three-year AI drive-thru project with IBM following persistent failures to interpret customer orders and viral customer complaints. Had the foundational dataset been more diverse, complete and governed from the outset, the outcome may have looked very different. 

When addressed upfront, strong data foundations lead to AI projects that are faster to launch, easier to manage, and more likely to deliver measurable impact. 

How master data management (MDM) comes into play 

Master data management (MDM) rarely makes AI headlines, but it quietly plays a pivotal role. MDM unifies, standardises, and governs key business data sets, ensuring clean inputs before AI models are trained or deployed.    

A practical approach starts with a comprehensive data audit: 

  • What exists? 
  • Where is it stored? 
  • What condition is it in? 

This exercise often uncovers duplicate records, missing data, or inconsistent details, any of which could undermine AI accuracy. 
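
As an illustration only, a first pass at this kind of audit can be as simple as profiling a key data set. The sketch below uses Python and pandas with hypothetical file and column names (customers.csv, email, country) to count duplicate records, missing values and inconsistent spellings; it is not tied to any particular MDM tool.

```python
# Minimal data-audit sketch; file and column names are illustrative assumptions.
import pandas as pd

customers = pd.read_csv("customers.csv")

# What exists?
print(f"{len(customers)} records, columns: {list(customers.columns)}")

# Duplicate records, keyed on a hypothetical natural key
duplicates = customers.duplicated(subset=["email"], keep=False)
print(f"Duplicate records by email: {duplicates.sum()}")

# Missing data, as a percentage per column
missing_pct = customers.isna().mean().mul(100).round(1)
print("Missing values (%):", missing_pct.to_dict())

# Inconsistent details, e.g. free-text country fields
normalised = customers["country"].str.strip().str.lower()
print(f"Raw country spellings: {customers['country'].nunique()}, "
      f"after normalising: {normalised.nunique()}")
```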

From here, organisations need routines for deduplication, enrichment, cleansing and standardisation, addressing process gaps and catching errors at source even as data volumes scale dramatically.
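
To make that concrete, the sketch below shows one possible shape such a routine could take for a customer table, again with assumed column names (email, country, last_updated) rather than any prescribed MDM workflow.

```python
# Minimal cleansing/standardisation sketch; column names and rules are
# illustrative assumptions, not a prescribed MDM workflow.
import pandas as pd

def standardise_customers(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Standardise: trim whitespace and normalise casing on key fields
    out["email"] = out["email"].str.strip().str.lower()
    out["country"] = out["country"].str.strip().str.title()
    # Cleanse: drop records missing the natural key
    out = out.dropna(subset=["email"])
    # Deduplicate: keep the most recently updated record per email
    out = (out.sort_values("last_updated")
              .drop_duplicates(subset=["email"], keep="last"))
    return out
```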

Understanding data lineage, including historical logs and files from legacy systems, is essential to avoiding hidden weaknesses in your data. Many businesses only discover these cracks after a mistake. To avoid this, proactive data management should be a requirement, not an afterthought.

Addressing data issues when preparing for AI 

A smart AI preparation strategy begins with a thorough data audit and a means of cataloguing source ownership and quality. After conducting your audit, you’ll likely uncover inconsistencies as well as obsolete datasets. Don’t try to fix everything at once; instead, focus on the biggest risks first.

For example, imagine a retail business preparing for AI-driven inventory forecasting that discovers sales data from several store locations is delayed or incomplete. Instead of pushing ahead with the AI project, the business forms a small cross-functional team, including IT and store management, identifies where the gaps occur, and implements data syncs at the affected sites. The result? Better-sourced data and fewer surprises during AI deployment.

In cases of widespread data issues, businesses should consider the following:  

  • Put AI developments on pause and fix the data first to save time, budget, and headaches down the road. 
  • Put ongoing data quality monitoring in place to catch future issues as soon as they arise (see the sketch after this list). 
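
As a sketch of that second point only, a scheduled check along the lines below (Python and pandas, with assumed column names and thresholds) could flag duplicate, missing or stale records before they ever reach a model.

```python
# Minimal ongoing data-quality check, intended to run on a schedule.
# Column names and thresholds are illustrative assumptions.
import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    # Age of each record in days, based on an assumed last_updated column
    age_days = (pd.Timestamp.now() - pd.to_datetime(df["last_updated"])).dt.days
    return {
        "duplicate_rate": float(df.duplicated(subset=["email"]).mean()),
        "missing_email_rate": float(df["email"].isna().mean()),
        "stale_rate": float((age_days > 365).mean()),
    }

def breached_thresholds(report: dict) -> list[str]:
    # Alert when a metric exceeds its (assumed) acceptable limit
    limits = {"duplicate_rate": 0.01, "missing_email_rate": 0.02, "stale_rate": 0.10}
    return [f"{metric}: {report[metric]:.2%} exceeds {limit:.0%}"
            for metric, limit in limits.items() if report[metric] > limit]
```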

Building reliable AI on solid data 

Trust, performance, and innovation all begin with quality data. AI can transform businesses, but only for those willing to do the unglamorous work of tidying up their data foundations before stepping forward.
