Large language models have sparked innovation across nearly every industry. From automating complex tasks to generating human-like text and powering new user experiences, the technology is rapidly transforming the business landscape. However, as AI adoption accelerates, it's crucial to recognise a less glamorous, but equally important truth: LLMs are not always the right solution.
In this article, I'll explore why and when organisations should not reach for LLMs, drawing from two anonymised cases of startups I've mentored in fintech and healthcare. But first, we need to understand what LLMs are, what they require, and where their limitations lie.
What Is AI? What Are LLMs?
Artificial intelligence (AI) broadly refers to systems that perform tasks requiring human-like intelligence, such as learning, reasoning, and decision-making. Machine learning (ML), a core branch of AI, enables systems to improve over time by learning from data, using algorithms like decision trees, regression, or neural networks.
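To make that concrete, here is a minimal, illustrative sketch of classic machine learning: a decision tree that learns a rule from a handful of labelled rows. The feature names and values are invented purely for illustration.

```python
# Minimal "classic ML" sketch: a decision tree learning from labelled,
# structured data (the features and labels here are invented).
from sklearn.tree import DecisionTreeClassifier

# Each row: [age, monthly_income, open_debts]; label 1 = high risk
X = [[25, 2000, 3], [40, 5500, 0], [33, 3100, 2], [51, 7200, 1]]
y = [1, 0, 1, 0]

model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(X, y)

# The model infers a decision rule from the data rather than
# following hand-written logic.
print(model.predict([[29, 2400, 2]]))
```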
Large language models (LLMs) represent a specific type of AI designed to understand and generate human-like text. Trained on massive datasets, these models excel at tasks like summarisation, text generation, and question answering, making them incredibly versatile across industries, from legal contract analysis to customer support.
What Do These Models Need?
At their core, AI models are only as good as the data that fuels them. LLMs require:
- High-volume, high-quality text data for training
- Robust fine-tuning datasets for specific tasks
- Human feedback loops to guide alignment
Key Limitations of LLMs
Despite their power, LLMs come with major constraints:
- Hallucination: LLMs sometimes produce outputs that are factually incorrect or fabricated, even when sounding confident and plausible.
- Explainability: Unlike simpler machine learning models, LLMs behave like "black boxes." It's often difficult (or impossible) to explain why a model produced a specific result.
- Dependence on Input Prompts: LLM outputs depend heavily on the quality and structure of the input prompt, adding unpredictability in high-stakes environments (a quick way to probe this is sketched just after this list).
- Data Mismatch: If the LLM hasn't been trained on domain-specific data, or if sensitive or proprietary data can't be shared, the model may underperform or produce unreliable results.
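One practical consequence of prompt dependence is that teams should test paraphrased prompts before trusting an output. Below is a rough sketch of such a probe, assuming the OpenAI Python client and an API key in the environment; the model name is illustrative, not a recommendation.

```python
# Sketch: probe prompt sensitivity by asking the same question two ways.
# Assumes the OpenAI Python client and OPENAI_API_KEY are configured;
# the model name is purely illustrative.
from openai import OpenAI

client = OpenAI()

prompts = [
    "Summarise this clause in one sentence: <clause text>",
    "In one sentence, what does this clause say? <clause text>",
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    # Even at temperature 0, a rephrased prompt can change the answer.
    print(response.choices[0].message.content)
```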
These limitations are not merely technical; they can fundamentally undermine business outcomes. Two startups I've worked with recently learned this the hard way.
The Healthcare Case: No AI Without Good Data
One healthcare startup I mentored wanted to introduce AI-powered diagnostics into hospitals. Their goal was to assist medical professionals by flagging high-risk cases and suggesting treatment options based on patient history. Their ambition was admirable, but their biggest obstacle wasn't the model or the algorithms. It was the data.
The hospital's records were riddled with problems: incomplete patient histories, missing fields, unstructured notes, inconsistent data entry practices among medical staff, and poor documentation of follow-up outcomes.
Without high-quality, reliable data, any machine learning solution was bound to fail. In this case, the most responsible and impactful advice was simple: "Don't implement AI yet. First, invest in digitisation and data hygiene."
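What does "invest in data hygiene" look like in practice? A sensible first step is simply measuring the damage. Here is a minimal sketch of such an audit with pandas; the file and column names are invented for illustration.

```python
# Sketch of a first-pass data-quality audit (file and column names
# are hypothetical placeholders, not from the actual hospital system).
import pandas as pd

records = pd.read_csv("patient_records.csv")

# 1. Completeness: what share of each field is missing?
print(records.isna().mean().sort_values(ascending=False))

# 2. Consistency: how many raw spellings collapse into one category
#    once whitespace and casing are normalised?
raw = records["diagnosis"].nunique()
clean = records["diagnosis"].str.strip().str.lower().nunique()
print(f"{raw} raw diagnosis values vs. {clean} after normalisation")

# 3. Follow-up: what share of cases has a documented outcome?
print(f"{records['followup_outcome'].notna().mean():.0%} of cases have outcomes")
```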
What this case highlights is critical: AI cannot compensate for poor data foundations. In industries like healthcare, the first priorities must be:
- Structured, consistent data capture
- Training staff on the importance of proper documentation
- Developing data governance policies
Once data quality improves, predictive models can be explored. Until then, AI efforts are likely to waste time, money, and resources.
The Fintech Case: Know Your Tools
Another startup I mentored, this time in the fintech sector, was building a platform to predict customer financial behaviour, such as default risk, churn probability, and spending patterns.
The founders were excited about LLMs and planned to use them to generate these predictions. Their thinking was that LLMs' ability to process unstructured text and complex correlations would give them an edge. However, they quickly ran into multiple challenges:
- Lack of explainability: In finance, regulatory and internal compliance standards require explainable models. LLM predictions couldn't offer justifications or traceable reasoning, making them unusable for high-stakes decisions.
- Inflexibility: LLMs struggled to incorporate structured financial data, such as transaction histories or credit scores, in a reliable way. These models excel at language tasks, but not at handling tabular numerical datasets.
- Missed Opportunities: Traditional statistical methods were far more accurate and transparent for this problem. These models could be easily tuned and enriched with human input.
Ultimately, I advised them to pivot back to traditional machine learning and statistical methods that offered better predictive performance, transparency, and compliance. The LLM approach not only underperformed, but it also added unnecessary complexity.
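For illustration, the sketch below shows the kind of transparent baseline that does fit this problem: a logistic regression over structured features, where every prediction traces back to one coefficient per feature. The features, values, and labels are invented, not the startup's data.

```python
# Sketch: an explainable default-risk baseline on tabular data.
# Features, values, and labels are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Columns: [credit_score, debt_to_income, missed_payments_last_year]
X = np.array([
    [720, 0.20, 0],
    [640, 0.55, 2],
    [590, 0.70, 4],
    [780, 0.10, 0],
    [610, 0.60, 3],
    [700, 0.30, 1],
])
y = np.array([0, 1, 1, 0, 1, 0])  # 1 = defaulted

scaler = StandardScaler()
model = LogisticRegression().fit(scaler.fit_transform(X), y)

# Each coefficient states how a feature shifts the log-odds of default;
# this is the traceable reasoning that compliance reviews ask for.
for name, coef in zip(
    ["credit_score", "debt_to_income", "missed_payments"], model.coef_[0]
):
    print(f"{name}: {coef:+.3f}")
```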
Final Thoughts: LLMs Are Not a Silver Bullet
LLMs and other advanced AI technologies are powerful tools, but they are not a universal solution. In many cases, the right answer may be:
- Investing in data infrastructure first
- Using simpler, well-understood models
- Prioritising explainability over complexity
C-suite leaders, product managers, and founders must ask: "Is this a problem that actually requires an LLM, or am I using it because it's trendy?" Responsible and effective AI isn't just about what's possible; it's about what's relevant.
AUTHOR
Tomer is a tech leader in the UK. He is currently at Google Cloud, with previous experience at Microsoft and several startups. Tomer holds a BSc in Computer Science and an MBA from the University of Oxford.