Intelligence at the edge
In the not-so-distant past, AI lived almost exclusively in the cloud. Large-scale inference was the domain of hyperscalers and GPU-rich data centres. But in 2025, that paradigm is dissolving. From Google’s Gemini Nano on smartphones to Meta’s LLaMA models running on Raspberry Pi-class boards, artificial intelligence is going local – embedded, offline, and edge-first.
This is not just a cost or privacy optimisation. It’s a tectonic shift in application architecture. Lightweight inference runtimes such as llama.cpp and its underlying GGML tensor library, together with compact architectures like Mamba variants, are turning even modest silicon (think Apple’s Neural Engine, Qualcomm’s Hexagon DSPs, or NVIDIA’s Jetson Nano) into real-time reasoning machines. But while models have leapt forward, the infrastructure underpinning them has barely budged.
Enter the embedded database. Or rather, re-enter it, now facing questions it was never built to answer.
From cloud-bound to context-aware: LLMs at the edge
For decades, edge computing meant fixed-function systems: IoT devices sending sensor data back to the cloud. But with transformer models shrinking through quantisation and architectural distillation, these devices are now capable of contextual understanding and reasoning locally.
Examples abound:
- Gemini Nano powers summarisation and smart reply entirely on Android devices
- Alpaca and LLaMA derivatives are running interactively on laptops and single-board computers
- PrivateGPT enables retrieval-augmented generation without ever sending tokens to external APIs.
Developers now seek agentic behaviour on-device: models that interpret local data, recall user-specific memory, and respond fluidly – all without an internet connection. This requires persistent, evolving knowledge bases that live beside the model. And it’s here that traditional data infrastructure is showing its age.
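To make that concrete, here is a minimal sketch of on-device retrieval in Python. The embed() function is a deliberate placeholder (a hash-seeded random vector, not a real model); in practice it would be replaced by a locally served embedding model, and the in-memory list by a persistent store.

```python
# Minimal on-device retrieval sketch: embeddings and search never leave the device.
# embed() is a stand-in for a local embedding model; it only illustrates the flow.
import hashlib
import numpy as np

DIM = 64

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: hash-seeded random unit vector, not semantically meaningful."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    vec = np.random.default_rng(seed).standard_normal(DIM)
    return vec / np.linalg.norm(vec)

# Local "memory": user notes that never leave the device.
notes = [
    "Meeting with the landlord moved to Thursday at 10am",
    "Prefer decaf coffee after 3pm",
    "Wi-Fi password for the cabin is in the red notebook",
]
memory = [(note, embed(note)) for note in notes]

def recall(query: str, k: int = 2) -> list[str]:
    """Return the k notes most similar to the query by cosine similarity."""
    q = embed(query)
    ranked = sorted(memory, key=lambda item: float(q @ item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

print(recall("when am I seeing the landlord?"))
```

Even at this toy scale, the open question is where those (note, embedding) pairs live between sessions, and that is exactly where traditional infrastructure starts to creak.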
Where legacy embedded databases fall short
Embedded databases like SQLite, Berkeley DB, or LevelDB are marvels of compact engineering. But they were never built for the demands of intelligent, schema-evolving, real-time applications.
Key shortcomings include:
- rigid schemas: updating schemas dynamically (e.g. as LLMs evolve representations) is non-trivial or unsupported
- lack of vector support: most cannot store, index, or search vector embeddings natively – making retrieval-augmented generation (RAG) hard or hacky
- no temporal context: these databases aren’t designed to act as memory – tracking evolving states, agents, or conversational threads
- weak concurrency and streaming: handling multiple reasoning threads, event streams, or multi-modal data (e.g. image + text) at once is challenging.
As a result, developers often bolt together multiple systems: a relational core, a key-value cache, an in-memory vector store. This incurs latency, complexity, and fragility – not ideal at the constrained edge.
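A caricature of that bolted-together stack makes the fragility obvious: three stores, one logical record, and all the consistency work left to application code. The snippet below is illustrative only.

```python
# A caricature of the bolted-together edge stack: a relational core, a
# key-value cache, and an in-memory "vector store", stitched together by hand.
import sqlite3
import numpy as np

conn = sqlite3.connect(":memory:")          # relational core
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")

kv_cache: dict[str, str] = {}               # key-value cache
vectors: list[tuple[int, np.ndarray]] = []  # in-memory vector store

def add_note(note_id: int, body: str, embedding: np.ndarray) -> None:
    # Every write touches three systems; a crash between any two of them
    # leaves the stores out of sync, and nothing here reconciles them.
    conn.execute("INSERT INTO notes VALUES (?, ?)", (note_id, body))
    kv_cache[f"note:{note_id}"] = body
    vectors.append((note_id, embedding))
    conn.commit()

add_note(1, "Meeting moved to Thursday", np.ones(64) / 8.0)
```

Nothing here is wrong, exactly; it is simply three consistency models and three failure modes on a device that can afford one.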
What a modern edge-ready database must support
To support AI-native applications at the edge, a new generation of databases must go beyond storage. They must become cognitive substrates – the memory, context, and decision fabric for autonomous systems.
Key requirements include:
- Multi-model flexibility. Edge agents may work with structured tabular data, unstructured text, hierarchical documents, time-series metrics, and vector embeddings. A modern database must support multi-model storage and querying without complex glue code.
- Native vector support. The future is retrieval-augmented. Whether for summarisation, personalisation, or local knowledge grounding, edge systems need persistent vector memory with fast approximate nearest-neighbour (ANN) search and efficient updates (see the sketch after this list).
- Dynamic schemas and metadata. Applications evolve. So should their schemas. Systems must support on-the-fly schema changes, flexible metadata, and introspective capabilities – enabling autonomous agents to extend their world models without downtime.
- Low latency, high concurrency. Edge inference is reactive. Queries must resolve in milliseconds. Support for concurrent, non-blocking operations, in-memory caching, and direct model-data affinity is crucial.
- Security at the core. Edge deployments are inherently exposed. Databases need zero-trust authentication, encryption at rest and in transit, and fine-grained access control – especially when handling user memory or private embeddings.
- Local-first, cloud-optional. Cloud synchronisation is useful – but optional. The database should function fully offline, with eventual consistency or cloud merge when reconnected. Think Git for memory, not just Dropbox for data.
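One way to read this list is as an API surface. The sketch below is purely hypothetical (EdgeStore is not an existing library); it only shows the shape of interface the first two requirements imply: one write path for structured fields and embeddings, no fixed schema to migrate, and similarity search built in.

```python
# Hypothetical interface sketch for the requirements above. EdgeStore is not
# a real library; it is a way of making the wish list concrete.
from dataclasses import dataclass, field
from typing import Any

@dataclass
class EdgeStore:
    """Illustrative facade over documents, flexible metadata, and vectors."""
    docs: dict[str, dict[str, Any]] = field(default_factory=dict)
    vecs: dict[str, list[float]] = field(default_factory=dict)

    def put(self, key: str, doc: dict[str, Any],
            embedding: list[float] | None = None) -> None:
        # Multi-model: structured fields and an optional embedding in one call,
        # with no fixed schema to migrate.
        self.docs[key] = doc
        if embedding is not None:
            self.vecs[key] = embedding

    def nearest(self, query: list[float], k: int = 3) -> list[str]:
        # Native vector support: brute force here; a real engine would use an ANN index.
        def score(key: str) -> float:
            return sum(a * b for a, b in zip(query, self.vecs[key]))
        return sorted(self.vecs, key=score, reverse=True)[:k]

store = EdgeStore()
store.put("pref:coffee", {"user": "a", "value": "decaf after 3pm"},
          embedding=[0.1, 0.9, 0.0])
print(store.nearest([0.0, 1.0, 0.0], k=1))
```

A production engine would replace the brute-force scan with an ANN index and layer on the latency, security, and local-first synchronisation properties from the rest of the list.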
The database as a decision engine
As LLMs move to the edge, we must rethink the role of the database entirely. It’s no longer just a store for facts – it’s the stateful memory and decision engine of autonomous applications.
In this emerging view:
- The database maintains agentic state – tracking goals, actions, environment, and interactions.
- It hosts semantic context – embedding similarity, user preferences, local observations.
- It guides reasoning – providing structured grounding for inferences, summaries, or responses.
- It supports collaboration – syncing across multiple devices, agents, or even users in mesh networks.
This turns the database into a co-pilot, not just a bookkeeper.
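To ground the idea, and using plain SQLite purely for familiarity, agentic state of this kind might be laid out roughly as below. The table and column names are invented for this sketch, not a proposed standard.

```python
# Illustrative agentic-state layout in SQLite; the schema is invented for
# this sketch and persists goals, observations, and actions locally.
import sqlite3

conn = sqlite3.connect("agent_memory.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS goals (
    id INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    status TEXT DEFAULT 'open'          -- open / done / abandoned
);
CREATE TABLE IF NOT EXISTS observations (
    id INTEGER PRIMARY KEY,
    observed_at TEXT DEFAULT CURRENT_TIMESTAMP,
    source TEXT,                        -- sensor, user, another agent
    body TEXT NOT NULL,
    embedding BLOB                      -- vector memory stored alongside
);
CREATE TABLE IF NOT EXISTS actions (
    id INTEGER PRIMARY KEY,
    goal_id INTEGER REFERENCES goals(id),
    taken_at TEXT DEFAULT CURRENT_TIMESTAMP,
    description TEXT NOT NULL,
    outcome TEXT
);
""")

conn.execute("INSERT INTO goals (description) VALUES (?)",
             ("Keep the greenhouse between 18 and 24 degrees",))
conn.commit()
```

The point is not this particular schema but the pattern: goals, observations, actions, and embeddings living in one local store that the model can query and update as it reasons.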
As companies like Open Interpreter, Personal.ai, and Ollama explore multi-agent frameworks, the demand for shared, contextual, low-latency memory becomes paramount. The edge will not be passive. It will reason, adapt, and act – with the database at the core.
Rethinking the stack
The AI edge revolution is not just about smaller models. It’s about re-architecting the entire intelligence stack – from models to memory, runtimes to storage.
The database must evolve accordingly. It must:
- Speak vectors and documents natively
- Adapt its shape on the fly
- Serve agents in real time
- Run on anything from a phone to a cloud cluster
- Guard data as fiercely as it serves it.
In this sense, we do not just need a new database for the edge – we need a new category. One that blends database, vector store, knowledge graph, and stateful memory into a unified, dynamic system.