
Before You Blame the AI Model, Check Your Data

By Thomas Urie, President and Chief Operating Officer at FormAssembly, a leading enterprise data collection and automation platform.

According to a recent MIT study, only 5% of AI pilots generate measurable business impact. It’s a staggering number given the fever pitch of AI hype and investment over the first half of the 2020s. However, it’s not a surprising one.  

For all the promise and excitement we’re witnessing firsthand with AI, the inaccuracies and annoyances are hard to miss.  

In our personal lives, the inconveniences are just that — minor frustrations that are quickly overshadowed by the incredible new conveniences we’re experiencing. 

For business leaders, however, the shortcomings of AI carry far greater weight. That same MIT study points to $30–40 billion in enterprise investment in generative AI. 

Most organizations are stuck in proof-of-concept mode. But what’s holding them back isn’t the intelligence of the models. It’s the quality of the data feeding them.  

AI can’t deliver real business outcomes when the foundation is fragmented, incomplete, or inconsistent. 

When Data Lies, Trust Dies 

Every good relationship starts with trust. This is especially true in the relationship between humans and AI. 

The challenge with AI is that trust can be fragile. AI hallucinations are all too real. What looks confident on the surface often hides uncertainty underneath.  

For businesses, understanding what’s feeding the model is just as important as how it’s being used. Is the model drawing from custom, curated data, or from open-source information with no guardrails? Is it recycling AI-generated data and teaching itself misinformation? 

The MIT study found that most GenAI systems fail because they can’t retain feedback, adapt to context, or improve over time. These are all symptoms of poor or unstructured data. When the foundation isn’t clean, every output compounds the errors. 

Garbage in, garbage out. And trust in AI evaporates. 

This isn’t a new problem — poor data quality has always undermined software performance — but AI magnifies it. The consequences are becoming increasingly visible for organizations that jumped in headfirst, investing in tools that were never trained on their space, customers, or operations.  

From Experimentation to Execution 

As Harvard Business Review points out, the organizations realizing real ROI from AI are moving beyond open-ended experimentation toward enterprise-aligned deployments. That means implementing systems that apply AI to specific, measurable use cases that are integrated into existing processes and governed with clear accountability. 

What that actually looks like depends on your business.  

What unique data does your organization have or need that maps directly to your desired outcomes? What integrations and safeguards are needed to connect that data to your AI models? And most importantly, what level of control and visibility do you need across the process from end to end?  

The large AI vendors are making these processes easier. But knowing what’s right for your business requires understanding both your data and your limits. The data you feed into these systems must be as unique as your organization, and it must be reliable. 

That last point can’t be overstated. Dirty data is still one of the biggest challenges facing businesses today. 

The Big Data wave of the 2010s brought an onslaught of unstructured data that is computationally intensive to work with. As data volume increases, processing slows, costs rise, and the usability of that data for LLMs declines. 

Even with advanced enterprise AI tools, the gap between promise and performance often comes down to how data is structured. If an LLM can’t recognize that your “opportunity ARR” field represents revenue, it can’t answer a basic question like “How much revenue did we close in October?” That’s not a failure of the model. It’s a failure of structure. 
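
To make this concrete, here is a minimal sketch in Python of how a field definition can travel with the data. The field names, the sample rows, and the prompt-building helper are all hypothetical; the point is simply that the model receives the business meaning of “opportunity ARR” alongside the values, rather than a bare column name.

```python
# A minimal, hypothetical sketch: pairing raw field names with business
# definitions so an LLM can map "opportunity_arr" to the concept of revenue.
# The resulting prompt text would be sent to whatever model API you use.

schema = {
    "opportunity_arr": "Annual recurring revenue (USD) for a closed-won opportunity.",
    "close_date": "Date the opportunity was marked closed-won (ISO 8601).",
}

rows = [
    {"opportunity_arr": 42000, "close_date": "2024-10-03"},
    {"opportunity_arr": 18500, "close_date": "2024-10-21"},
    {"opportunity_arr": 73000, "close_date": "2024-11-02"},
]

def build_prompt(question: str) -> str:
    # Send the definitions with the data, not just the column names.
    field_docs = "\n".join(f"- {name}: {desc}" for name, desc in schema.items())
    data_lines = "\n".join(str(row) for row in rows)
    return (
        f"Field definitions:\n{field_docs}\n\n"
        f"Data:\n{data_lines}\n\n"
        f"Question: {question}"
    )

print(build_prompt("How much revenue did we close in October?"))
```

With the definitions included, a model has a fair chance of connecting the October close dates to revenue; without them, “opportunity_arr” is just an opaque string.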

Clean Data Starts at the Source 

Before integrating your own data sources into an AI initiative, you have to think about how that data will be interpreted. LLMs are built to digest information written in context — words, explanations, and relationships. If all you provide is a table of field names with no definitions behind them, the model does not know what it’s looking at. 

That’s why the context around your data matters as much as the data itself. Most LLMs can analyze a paragraph in a document and generate an intelligent summary, but hand them the same information as a bare spreadsheet and they struggle. The same principle applies to any enterprise AI implementation. When data is structured, labeled, and accompanied by context, the model can draw accurate insights. Without that context, it’s just guessing. 

The path to usable AI data starts at the point of intake. Data collection is the front door to AI readiness. Every field name, label, and piece of context added at intake strengthens the model’s understanding later on. AI can only scale human intelligence if it’s fed human clarity.   
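
As one illustration, here is a hypothetical sketch (in Python, not any vendor’s actual schema) of what capturing that context at intake can look like: each form field carries a machine-readable name, a human label, a business definition, and a validation rule, so meaning and cleanliness are enforced at the front door.

```python
# Hypothetical intake-form field definitions (illustrative only).
# Capturing label, definition, and validation at collection time preserves
# the context an LLM needs downstream and blocks dirty data at the source.

from dataclasses import dataclass
import re

@dataclass
class FormField:
    name: str        # machine-readable identifier stored with the data
    label: str       # human-readable label shown on the form
    definition: str  # business meaning, carried forward as metadata
    pattern: str     # validation rule applied at intake

fields = [
    FormField(
        name="company_domain",
        label="Company website",
        definition="Primary web domain of the submitting organization.",
        pattern=r"^[a-z0-9.-]+\.[a-z]{2,}$",
    ),
    FormField(
        name="annual_budget_usd",
        label="Annual budget (USD)",
        definition="Planned yearly spend in US dollars, whole numbers only.",
        pattern=r"^\d+$",
    ),
]

def validate(field: FormField, value: str) -> bool:
    # Reject malformed input at the front door instead of cleaning it later.
    return re.fullmatch(field.pattern, value.strip().lower()) is not None

print(validate(fields[0], "example.com"))   # True
print(validate(fields[1], "about $50k"))    # False: dirty data stopped at intake
```

Stored alongside the submissions, those definitions become exactly the kind of context a model can use later, and the validation keeps bad values from ever entering the pipeline.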

When organizations shift from generic data to the unique information that reflects how their business actually operates, the results become more useful, relevant, and accurate.  

That does more than improve output quality. It builds trust. Users are far more likely to rely on AI when the responses feel tailored to their world and aligned with the way they think and work.  

Measuring What Matters 

Success with AI isn’t one-size-fits-all. Like any initiative, it starts with clear objectives and a shared understanding of what you’re trying to achieve. Are you measuring efficiency gains, faster workflows, or deeper adoption across teams? Define those metrics early, then evaluate whether the technology is actually helping you scale. 

While every organization’s goals will differ, trust must remain a constant measure. Trust in the accuracy of the insights, the integrity of the process, and the value the technology adds to human work. And trust in AI’s outputs starts with the inputs it’s fed. 

If you take one thing from this article, it’s that you need to be intentional about how you collect data and how you use it. When teams trust what AI delivers, adoption follows, impact grows, and confidence in the system becomes its own return on investment. 

About Thomas Urie 

Thomas Urie is the President and Chief Operating Officer of FormAssembly, a leading enterprise data collection and automation platform. With over 20 years of leadership experience, Thomas has a strong track record in driving growth and operational success across several technology organizations. 

Social Links 

https://www.linkedin.com/in/thomasurie/  

https://www.linkedin.com/company/formassembly/  

 

 
