With each new generation of Generative AI (GenAI) technology, the big tech businesses behind it will tout the new competencies and capabilities they have baked into their models, and users playing with it will likely have, at some point, a common response: how did it know how to do that? Whether it’s offering nuanced answers about choosing between local restaurants or explaining complicated concepts that human experts would be hard-pressed to define, a big part of the current excitement for GenAI lies in its (sometimes spooky) ability to know.
Of course, the truth is that the large language model (LLM) sitting underneath that chatbot doesn’t know these things – or at least, not in the way that we usually understand ‘knowing’. GenAI only encodes and recombines whatever information we give it: in the case of the commercial models that most of us have used, vast libraries of web data and written literature.
Nonetheless, even experts in the technology find it easy to forget that everything that comes out of GenAI is something that was, at some point, put in. That’s the power of computers replying in natural language rather than in strictly structured, formulaic responses. And that’s a big reason why GenAI is such a transformative technology, with the potential to automate everything from hyper-personalised marketing and customer support, to accurately filling and filing paperwork, to information retrieval for engineers and maintenance crews.
The dark magic of GenAI
The ability of GenAI to make us forget its inner workings is also, however, at the root of a great deal of the risk associated with the technology. Many of these risks have been well publicised. For example, hallucinations – in which training data gets incorrectly recombined into untrue, misleading, or nonsensical outputs – are harder to spot when users treat the tools as though they know things in the traditional sense.
Data privacy concerns, likewise, can arise when users assume that whatever a GenAI chatbot delivers is something that the business responsible for it is allowed to reveal. From the LLM’s perspective, unless specifically told otherwise, a person’s address or the details of their medical history that were inadvertently included in the training process are just data like any other.
And naturally, copyright issues remain a major concern across the growing GenAI market. While AI businesses are pursuing diverse avenues to resolve open questions around the relationship between copyright and GenAI, the nature of the technology means that inadvertent copyright and intellectual property violations remain a real possibility in commercial GenAI usage.
The limits of public data
This dark side of GenAI – the real risks that counterbalance its clear benefits – is one of the things restraining many businesses from investing wholeheartedly in the technology.
But there can also be question marks about what happens when none of these pitfalls are encountered. For instance, GenAI might deliver the knowledge you need, in the format you want, but that information might only be current as of the LLM’s training date some months or even years ago, and so miss elements of today’s reality.
Being trained on publicly available data, publicly available AI might know little to nothing about what is most important to your business. There’s no problem if you want to know about Henry VIII, of course, but what if you need a quick run-down of Policy Reference BCF-672876?
And often what we need is not just reference material, but insight into an ongoing situation. Even if the LLM powering your GenAI tool of choice has encoded information about Policy Reference BCF-672876, can it find and summarise your colleague’s latest report on claims being made against it?
Augmenting the artificial intelligence
Challenges like this need a different approach. The solution comes back to that basic fact about the technology: if GenAI can only tell us what we tell it, we need a better way to tell it what we want it to tell us.
That, in a simplified nutshell, is the problem addressed by Retrieval-Augmented Generation, or RAG. While RAG is not a brand-new idea (having first been put forward in 2020), it has been gaining traction recently as a missing ingredient to make GenAI relevant and effective in many more use cases.
In essence, RAG is an additional step on the journey from prompt to response in an interaction with GenAI. Users’ questions and inputs are first matched against a database that has been structured for the purpose – typically one supporting semantic search over the stored content – and any positive results from that search are passed along with the query to the LLM that the system is using.
This gives the GenAI tool access to data well beyond whatever it was originally trained on – anything that can be stored in a database, from support documentation to financial records to real-time sensor data, can form part of what we are telling the GenAI.
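To make that concrete, here is a minimal sketch of the retrieve-then-generate flow in Python. It uses simple word overlap in place of the semantic search a production system would use, and `call_llm` is a hypothetical stand-in for whichever model the system is built on; the document snippets are invented for illustration.

```python
# Minimal retrieve-then-generate sketch. Word overlap stands in for the
# semantic search a real system would use; call_llm is a hypothetical
# placeholder for whichever model the system is built on.

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (e.g. a chat-completion endpoint)."""
    return f"[model response to a prompt of {len(prompt)} characters]"

# The 'RAG database': any content we want the model to draw on.
documents = [
    "Policy Reference BCF-672876 covers water damage claims up to a set limit.",
    "Support hours are 9am to 5pm, Monday to Friday.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank stored documents by how many words they share with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(documents, key=lambda d: -len(q_words & set(d.lower().split())))
    return ranked[:k]

def answer(query: str) -> str:
    """The extra RAG step: retrieve first, then send the results with the query."""
    context = "\n".join(retrieve(query))
    prompt = f"Using only this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

print(answer("What does Policy Reference BCF-672876 cover?"))
```

The important part is the shape of the flow: the retrieval step runs first, and whatever it finds travels to the model inside the prompt.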
This approach has several advantages. It means that we can draw private or sensitive data into the responses of the most powerful LLMs without having to put that data through the initial training process. Because the RAG database can be updated separately from the LLM, this approach can also encompass real-time business data while avoiding the high cost of retraining the model.
And a more easily managed repository of data that is known to be good – whether in terms of its truthfulness, its specificity, or its appropriateness – provides a route towards minimising the risks that GenAI carries. We might, for example, create a system using the linguistic capabilities of best-in-class models, but with a rule that any technical data must be sourced from the business’s internal documents and cited back to them.
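As a rough illustration of such a rule, the sketch below wraps retrieved passages with source labels and instructs the model to cite them. The prompt wording and document IDs are assumptions, not a proven recipe.

```python
# One possible encoding of the citation rule: retrieved passages arrive
# labelled with their internal document IDs, and the prompt instructs the
# model to cite those IDs. Wording and IDs are illustrative assumptions.

def build_grounded_prompt(question: str, passages: dict[str, str]) -> str:
    """passages maps an internal document ID to its retrieved text."""
    sources = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages.items())
    return (
        "Answer using ONLY the sources below. Cite the source ID in "
        "brackets after every technical claim. If the sources do not "
        "contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )

print(build_grounded_prompt(
    "What does this policy cover?",
    {"BCF-672876": "Covers water damage claims, subject to a per-claim limit."},
))
```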
Preparing to speak to GenAI
While RAG solves many of the pain points associated with professional usage of GenAI technology, it’s important to understand that it isn’t an instant fix. The data that a business needs for any given use case is likely to be spread across several internal sources, from CRM and ERP systems to SaaS applications and legacy data repositories. Traditional data delivery methods, built around step-by-step workflows of sourcing, staging, storing, and sharing data, are too slow, costly, and inconsistent to feed RAG systems at the pace and volume they require.
That means that the first step towards RAG is reconsidering the business’s underlying data strategy. A unified, consistent virtualisation layer can expose ‘AI-friendly’ views of data: it sits across the underlying complexity and provides the metadata the LLM needs to understand what it is looking at. From there, businesses can more easily generate and optimise the databases that feed the information we need into future GenAI applications.
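As a rough sketch of what an ‘AI-friendly’ view might look like, the snippet below publishes one logical table, joined across several sources, together with the plain-language metadata an LLM needs to interpret it. The names and structure are illustrative assumptions, not a real virtualisation product’s API.

```python
# An illustrative 'AI-friendly' view: one logical table spanning several
# sources, published with plain-language metadata for the LLM. The names
# and structure are assumptions, not a real virtualisation product's API.

from dataclasses import dataclass, field

@dataclass
class VirtualView:
    name: str
    description: str                      # plain-language summary for the LLM
    columns: dict[str, str] = field(default_factory=dict)  # column -> meaning
    sources: list[str] = field(default_factory=list)       # underlying systems

claims_view = VirtualView(
    name="open_claims",
    description="One row per open claim, joined across CRM, ERP and reports.",
    columns={
        "policy_ref": "Policy identifier, e.g. BCF-672876",
        "claim_amount": "Claimed amount in the policy currency",
        "latest_report": "Most recent claims handler report (free text)",
    },
    sources=["crm.claims", "erp.payments", "docs.reports"],
)

def to_llm_context(view: VirtualView) -> str:
    """Render the view's metadata as text an LLM can use to ground answers."""
    cols = "\n".join(f"- {col}: {meaning}" for col, meaning in view.columns.items())
    return f"Table {view.name}: {view.description}\nColumns:\n{cols}"

print(to_llm_context(claims_view))
```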
One day, if you type ‘What is RAG?’ into your business’s GenAI tool of choice, this article might well contribute some small part of the answer you retrieve. But if your question is ‘How many workflows in my business rely on RAG?’ – well, you’d better have Retrieval-Augmented Generation properly implemented if you want a good answer.