Future of AI

The case for Small Language Models

By Mohan Varthakavi, VP of AI and Edge at Couchbase

For many enterprises, success means bringing essential operations closer to the people they serve to better meet their needs. That could involve simplifying online grocery ordering, helping staff handle customer requests in real time or ensuring that financial institutions can quickly detect and address fraud.

With the rapid advancement of artificial intelligence (AI), the dream of achieving operational efficiency is more attainable than ever. AI tools now offer businesses the potential to create applications that not only meet immediate needs but can also anticipate future requirements, surface actionable insights, and deliver enhanced customer and employee experiences. However, turning this potential into consistent, high-quality, real-world performance remains a significant challenge for many enterprises.

As of 2024, over 40% of large enterprises have actively deployed AI, with another 40% experimenting or in pilot phases (IBM). At the same time, more than 80% of AI-using businesses are exploring language models as part of their toolset (Gartner). While businesses are keen to adopt AI, they struggle to build applications that deliver the pace, scale and precision that modern operations require. How can enterprises ensure these systems are fast and flexible enough to adapt to ever-changing needs? How can they trust them to deliver accurate, safe outputs? And how can they deploy them responsibly, especially at scale?

While Large Language Models (LLMs) have dominated the headlines, Small Language Models (SLMs) often offer a more practical and manageable path forward, especially for specialised business use cases.

Choosing the right approach for the task

When designing an AI application, the first step is to understand its purpose and the users it is meant to serve. This insight is crucial in choosing the right tools and system architectures to support the application’s goals.

Although LLMs are highly capable and offer broad functionality across diverse contexts, they come with inherent challenges. For businesses, the stakes are often high, and relying on LLMs without carefully managing limitations such as hallucinations or bias carries real risk. These concerns are particularly pronounced when AI is used to make critical decisions: an LLM might generate plausible answers, but without proper oversight, those answers could be misleading or outright wrong.

In these cases, an SLM, designed specifically to understand the nuances of a given domain, can be more effective. By narrowing the focus, SLMs can provide greater precision and reliability, handling specific tasks with more consistent accuracy.

Efficiency, control and domain expertise

The main advantage of SLMs lies in their ability to focus. Where an LLM casts a wide net, an SLM is designed to deliver precise results within a particular subject area. This makes them ideal for use cases where specificity matters, such as customer service automation, fraud detection or compliance reporting.

Because SLMs are typically trained on proprietary or highly curated datasets, organisations have greater control over the quality and relevance of the data used. This is particularly beneficial for businesses that deal with sensitive or regulated information. Using proprietary datasets also helps reduce the risks associated with using third-party or open-source data, which might introduce biases or unreliable information into the system. In turn, this improves confidence in the model’s outputs and ensures alignment with internal standards and regulations.

Another benefit of SLMs is their ability to operate securely within an organisation’s own infrastructure. Unlike LLMs that may rely on external data sources or cloud-based processing, SLMs can be trained, deployed and maintained entirely within a company’s own environment. For businesses that handle sensitive data, this provides peace of mind by reducing exposure to privacy violations or compliance issues.

These models are already being used effectively in consumer devices. For example, many recent smartphones rely on SLMs to power features like voice assistants or predictive text, without needing to connect to the cloud. This allows for intelligent functionality in low-latency environments, where immediate responsiveness is required, and on devices with limited processing power.

Building on solid foundations

While SLMs offer significant benefits, their performance still depends on access to high-quality data and a robust infrastructure. They require an architecture that can support real-time inference and are most effective when deployed close to where data is generated.

Equally important is the question of where and how SLMs process data. As we’ve seen, they can be deployed directly on mobile and edge devices, which is essential for many use cases. In sectors like automotive, logistics and manufacturing, real-time decision-making is non-negotiable. For example, autonomous vehicles need to process information almost instantaneously to ensure both safety and responsiveness. In these scenarios, SLMs are ideal: compact enough to operate efficiently on devices with limited processing power, yet powerful enough to deliver fast, accurate results.

Many foundational models now offer lightweight versions that are optimised for local deployment. These smaller models bring a number of advantages, including lower computational costs, reduced latency and greater energy efficiency, making them particularly suited to edge devices and local machines. They are also typically easier to interpret and fine-tune. However, these benefits come with trade-offs: smaller models may exhibit reduced accuracy, limited knowledge retention and weaker adaptability, often requiring more extensive tuning for task-specific performance.

Finally, there’s the critical issue of data ownership and control. Retaining oversight of the data used to train and run any model is essential for both compliance with various data protection laws and assurance that the model learns from data that is accurate, complete and trusted by the organisation.

Get the balance right

No single model will suit every purpose. Companies should base the choice between an LLM and an SLM on the specific needs of the application, the quality and sensitivity of the data and the technical requirements of the environment in which it will operate.

SLMs offer businesses a compelling alternative, especially for those seeking to develop high-performance AI applications without the overhead and risks associated with larger models. The case for SLMs is clear: when the right balance is struck between performance, cost and control, SLMs can deliver better results than LLMs.
