Paris, Missouri or Paris, France?
Let me go out on a limb here and suggest that if you plan to go to Paris, you’re thinking about the one in France. With the Eiffel Tower. The Louvre. Croissants, baguettes, and fine wine. No offense to Missouri, but it’s not quite the same experience.
This is the problem standing between AI Agents and the workplace automation they are supposed to revolutionize: context. Humans are good at context. Our brains contain massive amounts of information that we’ve learned, either passively or actively, over the years. We almost intuitively understand that when someone says “Paris,” they mean the one in France.
Agents, however, only understand context to the extent that they can find the relevant data that will tell them what the context is. And the most popular solutions to the context problem are badly broken. There’s a reason 74% of AI projects are failing in the PoC valley of death. They may work on simple tasks, but they flounder in the real world where context matters.
Why Is Context So Hard?
You might be thinking that context isn’t that hard. After all, doesn’t ChatGPT handle context just fine? Yes, within its token limit and within a single conversation or project, it does a remarkably good job of managing the context of that conversation. That’s very different from sending an AI Agent off to accomplish complex, context-sensitive tasks.
Note that I’m distinguishing between complicated, multi-step, fairly unambiguous tasks, which agents can often do reasonably well, and tasks that require a higher-order understanding of circumstance, context, and assumptions about the world.
The second type of task is difficult precisely because, unlike people, agents don’t come with a lifetime of experience navigating the world. They have only the information we make available to them. That’s where things get messy. It’s hard to access data, hard to identify the right data, and particularly hard to do this rapidly, on the fly.
Data, Data Everywhere
And nobody stops to think… about why.
Corporate data landscapes are a hodgepodge of technologies, applications, and data formats. Ninety-two of the top 100 banks still run IBM mainframes alongside the latest cloud databases and AI applications. Telecoms use separate systems for network telemetry, billing, customer data, service data, and so forth. I could go on, but you get the point: lots of technologies, lots of data, lots of formats.
Here’s the kicker, though: where the data is located is a critical part of the context. That’s part of what an AI Agent needs to understand in order to handle the complex, context-sensitive tasks we dream of automating.
And our current approaches are making the problem worse, not better.
The Failure of Centralization
We have databases, data warehouses, data lakes, data lakehouses, and data lagoons (okay, there is no data lagoon. I’ll bet you were wondering, though). Each of these is an attempt to centralize data into one easily accessible location. Centralization doesn’t work anymore, if it ever did (it was certainly lucrative for centralization providers).
Data centralization sounds so simple in theory, especially if you’re not deeply versed in the ways of data and technology. The problem is that data has no inherent meaning: it’s just bits and bytes, 1’s and 0’s. The knowledge contained in data comes from knowing how to interpret it, and context is a major part of that.
Agents are like humans in that respect: information taken out of context is ambiguous, misleading, or incomprehensible. Without context, your agent could easily send you to Paris, Missouri rather than Paris, France.
It’s not just that centralization destroys context. Centralization is time-consuming, expensive, and brittle. A centralization project can easily involve months or years of effort, tremendous overhead (since every application that depends on data in a specific format still needs access to the right data when it needs it), and the pleasure of constantly tuning thousands of fragile ETL pipelines.
Data governance and data quality become harder after centralization. Centralization also creates a single attack surface for hackers. And, of course, once you’ve embarked on a centralization project, it’s extremely difficult to justify giving it up: the sunk cost fallacy in action.
Centralization doesn’t just wreck context, it increases vulnerability and complexity. Then it charges you a pretty penny for the privilege.
And then there are vector databases.
The Vector Trap
Vector databases are touted as the solution to making AI work in business. In reality, vector databases are the centralization problem on steroids. Not only do vector databases bring all the problems of data centralization, they also lack many of the key capabilities of modern databases:
- No robust aggregations or analytics: operations like SUM, COUNT, GROUP BY, moving averages, and window functions are not supported (or require external tooling).
- Limited SQL-like joins or complex filters: metadata filtering often exists, but inner joins, relational queries, or combining vector conditions with rich Boolean logic typically must occur upstream in an application layer.
- Poor transactionality and versioning: vector stores lack ACID semantics—parallel inserts and updates can cause race conditions, and rollbacks or consistent snapshots are rare.
- Absence of temporal, geospatial, or graph capabilities: many analytics scenarios rely on temporal trends or network relationships, which vector engines don’t natively support.
- No explainability or semantic transparency: similarity-based embeddings are opaque. When a vector DB returns a result, it’s difficult to trace why that result matched, which is a problem for compliance, data science validation, and auditability.
Organizations often require auxiliary systems to fill these gaps: re-joining data, performing aggregations, logging lineage, or rebuilding consistency. That adds cost, complexity, and latency. Trying to build agents on that foundation is a recipe for failure, which is exactly what’s happening.
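To make those gaps concrete, here is a purely illustrative sketch of the application-layer glue a vector store pushes onto you. The data shapes below are invented, not any particular product’s API; the similarity search happens in the vector store, but the join and the aggregation have to be rebuilt in your own code.

```python
from collections import defaultdict

# Toy stand-ins for the two systems. In practice these would be a vector
# store client and a relational database; the shapes here are invented
# purely to illustrate the glue code.
vector_hits = [  # what a top-k similarity search typically hands back
    {"doc_id": 1, "score": 0.93},
    {"doc_id": 2, "score": 0.88},
    {"doc_id": 3, "score": 0.71},
]
orders_table = [  # relational data the vector store has no notion of
    {"doc_id": 1, "region": "EMEA", "revenue": 1200.0},
    {"doc_id": 2, "region": "APAC", "revenue": 450.0},
    {"doc_id": 3, "region": "EMEA", "revenue": 300.0},
]

# Step 1: the join the vector store can't do, rebuilt in application code.
hit_ids = {h["doc_id"] for h in vector_hits}
joined = [row for row in orders_table if row["doc_id"] in hit_ids]

# Step 2: the GROUP BY / SUM the vector store can't do, also rebuilt by hand.
revenue_by_region = defaultdict(float)
for row in joined:
    revenue_by_region[row["region"]] += row["revenue"]

print(dict(revenue_by_region))  # {'EMEA': 1500.0, 'APAC': 450.0}

# Every step like this is extra latency, extra code to maintain, and another
# place where access control and lineage can quietly fall through the cracks.
```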
Adding insult to injury, just the other day, AWS announced that its new Amazon S3 Vectors lets you drop raw embeddings straight into S3 with no clusters and no tuning, at up to 90% lower cost for ingest, storage, and queries. Think about that: the entire world of vector database centralization just went up in smoke.
So what does work?
Orchestration Bests Centralization
An intelligent data orchestration layer built with a Zero ETL approach can search data where it lives and use AI to analyze, deduplicate, and relevancy-rank results. Zero ETL means data stays in place, and context is preserved. This is vital for effective AI Agents.
The benefits touted by vector databases can be achieved with as-needed, in-memory vector processing, neatly sidestepping the drawbacks of running an actual vector database.
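As a rough illustration of what as-needed, in-memory vector processing can look like, here is a toy sketch (not SWIRL’s implementation; the term-frequency “embedding” stands in for whatever real embedding model you would call at query time). The vectors exist only for the duration of a single query, and nothing is copied into a standing vector database.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: a simple term-frequency vector. In practice you
    # would call a real embedding model here, at query time, on demand.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_in_memory(query: str, candidates: list[str], top_k: int = 3) -> list[str]:
    # Embed only the handful of results the source systems just returned,
    # rank them, and throw the vectors away. Nothing to build, load,
    # synchronize, or secure afterward.
    q = embed(query)
    scored = sorted(candidates, key=lambda c: cosine(q, embed(c)), reverse=True)
    return scored[:top_k]

results_from_sources = [
    "Flight options to Paris, France departing next Tuesday",
    "Paris, Missouri town council meeting minutes",
    "Hotel availability near the Louvre, Paris",
]
print(rank_in_memory("book a trip to Paris France near the Eiffel Tower", results_from_sources))
```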
In addition to being considerably cheaper than vector databases, orchestration enables lighter, faster, more agile systems, which translates into businesses that are more nimble, more responsive to customers, and better able to take advantage of brief windows of opportunity. Development with intelligent orchestration is also considerably faster than development built on data centralization.
Orchestration layers make complex AI and agentic AI projects possible without dragging a vector database along or requiring a technology remodel. Skip the months- (or years-) long, ETL-based data centralization project. An orchestration layer that takes a Zero ETL approach, such as SWIRL AI Connect (www.swirlaiconnect.com), will (see the sketch after this list):
- Dynamically query multiple systems simultaneously in real time
- Vectorize only what’s relevant, when needed
- Respect existing access controls, SSO, and PII policies
- Deliver contextual answers across documents, databases, APIs, and cloud platforms
- Give agents the reliable, trustworthy data they need to accomplish your goals
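To make the pattern concrete, here is a minimal sketch of that Zero ETL query flow. The connectors are stubs and none of the names are SWIRL’s actual API; the point is the shape: fan out to the systems of record in parallel, let each one enforce its own access controls, and merge and rank only the results, leaving the data where it lives.

```python
import asyncio

# Stub connectors standing in for real systems of record (a CRM, a SQL
# warehouse, a document store). Each one queries in place; nothing is
# copied into a central store first.
async def search_crm(query: str, user_token: str) -> list[dict]:
    return [{"source": "crm", "text": f"CRM record matching '{query}'"}]

async def search_warehouse(query: str, user_token: str) -> list[dict]:
    return [{"source": "warehouse", "text": f"Warehouse rows matching '{query}'"}]

async def search_docs(query: str, user_token: str) -> list[dict]:
    return [{"source": "docs", "text": f"Documents matching '{query}'"}]

async def federated_search(query: str, user_token: str) -> list[dict]:
    # Fan out to every source at once; each connector passes the caller's
    # credentials through, so existing access controls keep doing their job.
    connectors = (search_crm, search_warehouse, search_docs)
    per_source = await asyncio.gather(*(c(query, user_token) for c in connectors))
    merged = [hit for hits in per_source for hit in hits]
    # A real orchestration layer would deduplicate and relevancy-rank here,
    # for example with the in-memory ranking sketched earlier.
    return merged

print(asyncio.run(federated_search("Q3 churn in the Paris region", user_token="sso-token")))
```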
Unlike centralization and vector databases, SWIRL requires minimal change to your existing technology and data ecosystem. By discarding the outdated assumption that data centralization is good and necessary, we free ourselves to stop fighting with technology and instead use it to sharply increase business success. It’s how we make AI work at the speed of business.
Take Back Control of Your Tech Stack
When businesses engage in increasingly complex and time-consuming efforts to centralize and synchronize data, they lose control of their technology stack. It’s time to discard outdated assumptions and take back control.
Intelligent orchestration layers, such as SWIRL, give you the power you need without the overhead that centralization and vector databases bring. Forget about managing complex ETL tasks and data migration, struggling with synchronization and data drift, compensating for the limitations of vector databases, addressing governance and security gaps, and working with ever-growing infrastructure complexity. Like the time of the dinosaurs, the time of ponderous data remodeling projects is over.
With SWIRL you can start developing today, not in six months or a year when your centralization project is complete. If it ever is.
For more information, download our white paper, request a demo, or visit our website.