
Inside the Architecture of an MCP Server for PostgreSQL

By Dave Page, CTO of pgEdge

As AI agents move beyond chatbots and code completion into direct interaction with production databases, the Model Context Protocol (MCP) has emerged as the standard interface between large language models and external systems such as PostgreSQL. The temptation is to treat an MCP server as a simple connector: something that shuttles SQL from an LLM to a database and returns the results. In practice, however, designing an MCP server for PostgreSQL is fundamentally an exercise in control: controlling what queries run, what data comes back, and how much context each interaction consumes. Without deliberate architecture, teams risk unsafe SQL execution, overexposed schemas, runaway token costs, and the kind of production instability that makes a DBA’s phone ring at 3 a.m.

Having spent considerable time building and open-sourcing an MCP server for Postgres, I can share some of the practical lessons and trade-offs that shaped its design. 

Architecting the MCP Layer: From Connector to Control Plane 

The first architectural insight is that an MCP server cannot simply proxy SQL between an LLM and PostgreSQL. A naive pass-through approach treats the database as a function the AI can call at will, which is roughly equivalent to giving an intern unrestricted superuser access to your production cluster on their first day. Instead, the MCP layer must act as a mediation and policy layer, one that enforces explicit security boundaries before any query reaches the database.  

In practice, this means designing around two fundamental pillars: role-based access controls and read-only defaults. Connections should default to read-only at the session level, so that even if query-level validation fails, PostgreSQL itself will reject writes. Write access should be opt-in per database, not a global toggle, and the MCP server’s tool descriptions should dynamically reflect whether a given connection is read-only or write-enabled so that the AI agent itself understands its boundaries. Configuring the MCP server to connect using a PostgreSQL role with minimal privileges adds another layer of defence in depth; even if the MCP layer’s controls are somehow bypassed, the database itself will enforce the role’s access restrictions on schemas, tables, and operations.  
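The read-only-by-default pillar can be enforced at the PostgreSQL session level using the standard `default_transaction_read_only` setting. The sketch below is illustrative: the `write_enabled_dbs` configuration shape and both function names are assumptions, not part of any particular MCP implementation, but the GUC itself is standard PostgreSQL.

```python
def connection_options(database: str, write_enabled_dbs: set) -> dict:
    """Build connection settings; sessions default to read-only unless the
    database has been explicitly opted in to writes (hypothetical config)."""
    read_only = database not in write_enabled_dbs
    return {
        "dbname": database,
        # Enforced by PostgreSQL itself: writes fail at the session level
        # even if query validation in the MCP layer is bypassed.
        "options": "-c default_transaction_read_only=on" if read_only else "",
    }

def tool_description(database: str, write_enabled_dbs: set) -> str:
    """Surface the connection's mode in the tool description, so the agent
    itself understands its boundaries."""
    mode = "write-enabled" if database in write_enabled_dbs else "read-only"
    return f"Execute SQL against '{database}' ({mode} connection)."
```

Because the restriction lives in the session settings rather than in query inspection, it holds regardless of what SQL the agent produces.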

Perhaps most importantly, every piece of AI-generated SQL must be treated as untrusted input. This is no different from how a well-designed web application treats form submissions; you validate, constrain, and sanitise before execution. The MCP server should wrap queries in transactions with appropriate isolation levels, use safe identifier quoting to prevent injection in any server-constructed SQL, and reject attempts to manipulate session variables that could circumvent read-only protections. It is worth noting that LLM-generated SQL cannot straightforwardly use traditional parameterised queries, because the agent constructs the entire statement including table names, column references, and filter values as a single string; the MCP layer must therefore rely on structural validation and execution-time controls rather than parameter binding alone. MCP is infrastructure, not merely a convenience layer, and it deserves the same defensive design rigour as any other component in the data path. 
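A minimal sketch of the structural validation described above might look like the following. The pattern list and helper names are illustrative rather than exhaustive; a production server would use a proper SQL parser, but even a simple check demonstrates rejecting session-variable manipulation and quoting identifiers in server-constructed SQL.

```python
import re

# Statements that could flip session state and bypass read-only defaults.
_FORBIDDEN = re.compile(
    r"^\s*(SET|RESET|BEGIN|COMMIT|ROLLBACK|START\s+TRANSACTION)\b"
    r"|\bset_config\s*\(",
    re.IGNORECASE,
)

def validate_statement(sql: str) -> None:
    """Structural validation of agent-generated SQL (illustrative, not
    exhaustive): reject attempts to manage transactions or session
    settings from within the statement itself."""
    for stmt in filter(None, (s.strip() for s in sql.split(";"))):
        if _FORBIDDEN.search(stmt):
            raise ValueError(f"Rejected unsafe statement: {stmt[:60]!r}")

def quote_identifier(name: str) -> str:
    """Safe identifier quoting for any SQL the server itself constructs,
    following PostgreSQL's double-quote escaping rules."""
    return '"' + name.replace('"', '""') + '"'
```

Since the agent's statements cannot use parameter binding, checks like these sit alongside the session-level read-only enforcement rather than replacing it.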

Building Query Safety into the MCP Server 

Even semantically valid SQL can create serious operational strain in PostgreSQL. A perfectly legal SELECT * against a table with fifty million rows will cheerfully consume memory, saturate I/O, and block other workloads whilst the database dutifully returns every last record. An AI agent generating queries has no inherent understanding of the operational cost of what it produces, which means the MCP layer must enforce guardrails on the agent’s behalf. 

Statement classification is a useful starting point. Categorising incoming queries as SELECT, DDL, DML, or DML-with-RETURNING allows the MCP server to apply different policies to each type. Row limits should be enforced by default (a sensible default is 100 rows, with a configurable ceiling of perhaps 1,000), and the server should transparently append LIMIT clauses to SELECT statements when the agent omits them. A particularly useful pattern is to fetch one row beyond the requested limit; this lets the server detect truncation and inform the agent that additional pages are available, enabling pagination without ever returning an unbounded result set.
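The classification, LIMIT-appending, and fetch-one-extra patterns can be sketched as follows. This is a simplified illustration (string inspection stands in for a real SQL parser, and the function names are assumptions), but the N+1 truncation trick is exactly as described above.

```python
def classify(sql: str) -> str:
    """Coarse statement classification so different policies apply per type."""
    head = sql.lstrip().split(None, 1)[0].upper()
    if head in ("SELECT", "WITH", "TABLE", "VALUES"):
        return "SELECT"
    if head in ("CREATE", "ALTER", "DROP", "TRUNCATE"):
        return "DDL"
    if head in ("INSERT", "UPDATE", "DELETE"):
        return "DML_RETURNING" if "RETURNING" in sql.upper() else "DML"
    return "OTHER"

def bounded_select(sql: str, limit: int = 100, ceiling: int = 1000):
    """Append a LIMIT when the agent omits one, asking for one extra row
    so the server can detect truncation (a real parser would be safer
    than this substring check)."""
    limit = min(limit, ceiling)
    stripped = sql.rstrip().rstrip(";")
    if "LIMIT" not in stripped.upper():
        stripped = f"{stripped} LIMIT {limit + 1}"  # N+1 enables truncation detection
    return stripped, limit

def paginate(rows: list, limit: int):
    """Trim to the requested limit and report whether more rows exist."""
    return rows[:limit], len(rows) > limit
```

The extra row is never returned to the agent; its presence simply tells the server to report that another page is available.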

The critical architectural principle here is separating query generation from execution approval. The AI agent generates SQL based on its understanding of the task, but the MCP server independently decides whether and how to execute that SQL. This separation creates a natural checkpoint where restrictions, execution guards, and cost controls can be applied without burdening the agent with operational concerns. 

It is worth noting that the MCP specification itself does not yet provide hard mechanisms for enforcing user approval before tool execution. The current spec defines tool annotations such as readOnlyHint and destructiveHint, but these are explicitly advisory; the specification recommends that clients keep a human in the loop for sensitive operations, without mandating it. There is active community discussion around stronger enforcement models, including proposals for cryptographically signed approvals and mandatory action routing for enterprise security, but for now, the MCP server itself must shoulder the responsibility for execution safety. 
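Because the annotations are advisory, the enforcement belongs in the server. A minimal sketch, assuming the statement classification from earlier in this section (the `gate_execution` helper is hypothetical; `readOnlyHint` and `destructiveHint` are the annotation names from the MCP tool specification):

```python
def tool_annotations(read_only: bool) -> dict:
    """Advisory hints per the MCP tool spec; clients are recommended, but
    not required, to keep a human in the loop based on these."""
    return {"readOnlyHint": read_only, "destructiveHint": not read_only}

def gate_execution(statement_type: str, write_enabled: bool) -> None:
    """Server-side enforcement (illustrative): since clients need not
    honour the annotations, the server itself refuses anything beyond
    SELECT on a read-only connection."""
    if statement_type != "SELECT" and not write_enabled:
        raise PermissionError(
            f"{statement_type} blocked on read-only connection"
        )
```

Until the specification gains stronger approval mechanisms, a hard check like this is the only enforcement that cannot be ignored by a permissive client.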

Designing for Token Efficiency 

When AI systems interact with databases, every response consumes limited context space and computational budget. A query that returns a few hundred rows of wide tabular data can easily consume thousands of tokens; multiply that by a multi-step reasoning chain, and you can exhaust an agent’s context window before it finishes its task. Token efficiency is often framed as a cost concern, but in practice it has a far more direct impact on reliability, responsiveness, and system stability. 

MCP server design should prioritise returning only what is necessary, in the most compact representation possible. One effective approach is to use tab-separated values rather than JSON for query results; TSV is significantly more compact and, counterintuitively, easier for LLMs to parse than deeply nested JSON structures. Type-aware formatting matters too: converting PostgreSQL’s native types (numerics, timestamps, UUIDs, intervals) into clean, human-readable strings avoids the verbosity of raw type metadata. NULL values present an interesting design choice: representing them as empty strings in the output is significantly more compact than repeating the word ‘null’ thousands of times across a result set. Strictly speaking, NULL and an empty string are semantically different in PostgreSQL (NULL means ‘unknown’, whilst an empty string is a known value that happens to contain nothing), but in practice this distinction rarely matters to an LLM that is reasoning about query results rather than performing precise data comparisons. 
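The TSV formatting, type-aware conversion, and empty-string NULLs described above can be sketched in a few lines. The function names are illustrative; the type handling shown (timestamps, numerics, UUIDs) mirrors the PostgreSQL types mentioned in the text.

```python
from datetime import datetime
from decimal import Decimal
from uuid import UUID

def format_value(v) -> str:
    """Render values compactly: ISO-style timestamps, plain numeric and
    UUID strings, and an empty string for NULL (compact, at the cost of
    conflating NULL with empty text, as discussed above)."""
    if v is None:
        return ""
    if isinstance(v, datetime):
        return v.isoformat(sep=" ")
    if isinstance(v, (Decimal, UUID)):
        return str(v)
    return str(v)

def to_tsv(columns: list, rows: list) -> str:
    """Tab-separated output: one header line, one line per row, with none
    of the repeated key names that JSON would carry per row."""
    header = "\t".join(columns)
    body = "\n".join(
        "\t".join(format_value(v) for v in row) for row in rows
    )
    return f"{header}\n{body}" if rows else header
```

For a wide result set, the saving over JSON is substantial: column names appear once in the header rather than being repeated in every row object.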

Schema introspection deserves particular attention. A get_schema_info tool that dumps every table, column, index, and constraint in a database can easily produce tens of thousands of tokens, far more than any agent needs for most tasks. Filtering schema responses by specific schema name and structuring the output so that AI agents can request progressively more detail keeps the context window focused on relevant metadata. The goal is to give the agent just enough information to reason correctly, not to reproduce the contents of pg_catalog in its entirety.
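Progressive disclosure can be as simple as two levels of query against the standard information_schema views: table names first, column detail only on request. The `schema_request` helper is a hypothetical shape for such a tool; the queries themselves use standard information_schema columns.

```python
# Level 1: just the table names in one schema.
TABLE_SUMMARY_SQL = """
SELECT table_name
FROM information_schema.tables
WHERE table_schema = %s
ORDER BY table_name
"""

# Level 2: column detail for a single table, on request.
COLUMN_DETAIL_SQL = """
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
ORDER BY ordinal_position
"""

def schema_request(schema: str, table=None):
    """Return the query and parameters for the requested level of detail,
    always scoped to a single schema (and optionally one table)."""
    if table is None:
        return TABLE_SUMMARY_SQL, (schema,)
    return COLUMN_DETAIL_SQL, (schema, table)
```

An agent that needs only one table's columns then pays for one table's columns, not the whole catalogue.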

Operationalising the MCP in Production 

Production MCP deployments introduce challenges that are invisible during development. When multiple AI agents share a PostgreSQL instance, connection management becomes critical; each agent session needs its own isolated connection pool, keyed to its authentication token, to prevent cross-session data leakage and ensure that one agent’s workload cannot starve another of connections. Connection pools should be configurable per database, with sensible defaults for maximum connections, idle timeouts, and health check intervals, and multi-host configurations should support failover with automatic health checking.  
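The per-session pool isolation described above can be sketched with a registry keyed on a hash of the authentication token plus the database name. The `PoolRegistry` class is illustrative (a plain list stands in for a real connection pool), but the keying scheme is the point: two agents never share a pool, and the same agent always gets its own back.

```python
import hashlib

class PoolRegistry:
    """One pool per (auth token, database): isolates sessions so one
    agent's workload cannot starve another of connections (sketch)."""

    def __init__(self, max_size: int = 5):
        self.max_size = max_size
        self._pools = {}

    def _key(self, token: str, database: str):
        # Key on a hash so raw tokens are not retained in the registry.
        digest = hashlib.sha256(token.encode()).hexdigest()[:16]
        return (digest, database)

    def pool_for(self, token: str, database: str):
        """Return this session's pool, creating it on first use."""
        k = self._key(token, database)
        if k not in self._pools:
            self._pools[k] = []  # stand-in for a real connection pool
        return self._pools[k]
```

In a real deployment the stand-in list would be a proper pool with the configurable maximums, idle timeouts, and health checks described above.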

Observability is arguably the most underinvested area in early MCP server designs. At a minimum, the MCP layer should log every query it executes along with metadata: row counts, whether results were truncated, estimated token consumption, and execution duration. A structured trace log in JSONL format, with entries for tool calls, tool results, resource reads, and session lifecycle events, provides an audit trail that is invaluable for debugging, compliance, and performance analysis. Sensitive parameters (passwords, API keys, session tokens) must be automatically redacted from trace output; it is remarkably easy to accidentally log credentials when recording the full input and output of every tool invocation. 
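A minimal sketch of the JSONL trace with automatic redaction might look like this. The field names in `SENSITIVE` and the record shape are assumptions for illustration; the important behaviour is that redaction runs recursively before anything is serialised.

```python
import json
import time

# Keys whose values must never appear in trace output.
SENSITIVE = {"password", "api_key", "token", "authorization"}

def redact(obj):
    """Recursively mask sensitive fields before they reach the log."""
    if isinstance(obj, dict):
        return {
            k: "[REDACTED]" if k.lower() in SENSITIVE else redact(v)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [redact(v) for v in obj]
    return obj

def trace_entry(event: str, payload: dict) -> str:
    """One JSONL line per tool call, tool result, resource read, or
    session lifecycle event."""
    record = {"ts": time.time(), "event": event, "payload": redact(payload)}
    return json.dumps(record)
```

Redacting inside the serialisation path, rather than trusting each call site to remember, is what prevents the accidental credential logging described above.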

Long-running agent sessions present their own challenges. As an agent accumulates context over dozens of database interactions, earlier results become less relevant but continue to consume the context window. Designing the MCP server to support concise, paginated responses from the outset gives agents (and the orchestration layers above them) the building blocks they need to manage context effectively. Combined with per-request tracking (session IDs, request IDs, token hashes), this creates a system where operators can trace any agent action from prompt to SQL execution and back, even across sessions that span hours. 

Looking Forward 

Designing an MCP server for PostgreSQL is fundamentally an architecture discipline; it requires the same careful thinking about security boundaries, failure modes, and operational concerns that we apply to any other piece of critical infrastructure. Security, query safety, token efficiency, and observability cannot be bolted on after the fact; they must be woven into the design of the MCP layer itself. As AI-native workflows expand beyond prototypes into production enterprise environments, purpose-built MCP servers will become a core component of how organisations integrate AI with their PostgreSQL deployments, and the teams that treat this integration as a first-class architecture problem will be the ones that succeed. 
