I think we need to talk about how AI is creating more technical debt than it's solving. Let me give you a real example: one of our engineers recently tried to build an API for our training pipeline using AI tools. It looked like the perfect way to speed things up: the code looked right and ran fine at first. What actually happened is that he spent nearly two weeks untangling and debugging something that could have been built properly in less than a day. And this wasn't even the first time; he'd sworn never to make this mistake again after a similar experience months earlier. But the allure of quick solutions, right?
When your AI electrician doesn’t leave a manual
Let me explain why this happens with an analogy I think about often: electrical wiring in a house. Hire two or three different electricians and they'll wire things in different ways. One might start in the kitchen, another at the electrical box, another outdoors. They'll label things differently and use different coloured wiring. So when something breaks and you bring in a new electrician, they'll tell you, "I'll probably have to spend a whole day figuring out what the last one did." There's no documentation, no walkthrough. And who knows what has changed over time?
A million lines of nobody’s code: the scale problem
With AI, this problem gets far worse. Say a machine writes a million lines of code: it can hold all of that in its head and reason about it. A human can't. Even if you wanted to fix a problem, you couldn't realistically sift through that much code you've never seen before just to find where the problem might be. In our case, what made it particularly tricky was that the AI-generated code had very subtle logical flaws: not syntax errors, just small problems in the execution logic that you wouldn't notice at a glance.
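To give a flavour of what I mean, here's a contrived Python sketch (not the actual code from our pipeline): it parses, runs, and looks perfectly plausible, yet quietly loses data.

```python
# Contrived illustration of a "subtle logical flaw": nothing is wrong
# syntactically, and the code works for many inputs.
def batch_records(records, batch_size=100):
    batches = []
    # Bug: integer division ignores the remainder, so when len(records)
    # isn't an exact multiple of batch_size, the trailing records are
    # silently dropped. No error, no warning, just missing data.
    for i in range(len(records) // batch_size):
        batches.append(records[i * batch_size:(i + 1) * batch_size])
    return batches


# A correct version keeps every record, including the final partial batch.
def batch_records_fixed(records, batch_size=100):
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]
```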
Technical debt grows not just with complexity but with the sheer amount of code being shipped. It's almost a natural law: ship more code and you will have more bugs and more debt. If AI lets you ship many times more code, then yes, maybe you catch some issues during review, but whatever slips through still gets shipped. The volume itself becomes the problem.
From autopilot to guardrails: how to actually use AI without breaking everything
I think the solution lies in far better communication throughout the whole organisation, coupled with robust processes and tooling. Here are my recommendations, from our own experience.
- Set clear ground rules: You now have to be much more explicit about your policies, even the unspoken ones. This starts with comprehensive coding guidelines that specifically address AI-generated code. We've learned to be incredibly specific here. For instance, our guidelines now include rules like "AI-generated code must include explicit error handling for each external service call" and "any function over 20 lines must include inline documentation explaining its logic." We even specify naming conventions that make AI-generated code immediately identifiable, which helps when debugging later (there's a short example of what this looks like after the list).
- Customise your AI tools: The tooling side is equally important. We've customised our AI tools' settings to align with our tech stack and standards: prompt templates that enforce our coding style, pre-configured with our preferred libraries and frameworks (a rough template sketch follows the list). It's like having a new team member who's been thoroughly onboarded to your ways of working.
- Be aware of which libraries the AI uses: Many teams deliberately don't use the most up-to-date libraries, and there's usually a good reason for that: you don't know whether new versions are backwards compatible or whether they introduce new bugs. With mission-critical software, the common attitude is "if it's not broken, don't fix it." But AI will often default to the most up-to-date library, because that's what's considered best practice.
- Test before you generate: Here's what we've learned works: when dealing with mission-critical code, write your tests first. Lay out exactly what you expect the code to do, define your edge cases, then let the AI implement against those tests (see the example tests after this list). It's a way of constraining what's possible before the AI starts generating code.
- Build AI-specific safeguards: We've also developed custom linters that catch AI-specific issues. For example, our linter flags when AI generates overly complex nested conditionals (a common issue we've noticed) or when it creates functions with too many parameters. We've even built validators that check for consistent error handling patterns and logging standards, things AI tools often overlook (a minimal linter sketch follows the list).
- Automate quality checks: Before any AI-generated code hits production, it goes through a gauntlet of automated checks: performance benchmarks that measure response times and memory usage, security scans for common vulnerabilities, and integration tests that verify it works with our existing systems (one such check is sketched after the list). This might sound like overkill, but we've found it catches issues that would be much more expensive to fix later.
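To make the ground-rules bullet concrete, here's roughly what a guideline-compliant, AI-generated helper might look like. The "aigen_" prefix, the service URL and the function itself are hypothetical; the point is the explicit error handling around the external call and the naming convention that flags the code's origin.

```python
import logging

import requests

logger = logging.getLogger(__name__)


# Hypothetical naming convention: AI-generated helpers carry an "aigen_"
# prefix so reviewers can spot them immediately when debugging.
def aigen_fetch_run_metadata(run_id: str, timeout: float = 5.0) -> dict:
    """Fetch metadata for a training run from an internal service.

    Per the guidelines: explicit error handling for the external call.
    """
    url = f"https://metadata.internal.example/runs/{run_id}"  # placeholder URL
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.Timeout:
        logger.error("Metadata service timed out for run %s", run_id)
        raise
    except requests.HTTPError as exc:
        logger.error("Metadata service returned %s for run %s",
                     exc.response.status_code, run_id)
        raise
    return response.json()
```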
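For the tool customisation bullet, the idea is simply that every generation request starts from the same house rules. A rough sketch of such a template; the stack details and wording are illustrative, not our actual configuration.

```python
# Illustrative prompt template; the stack and rules here are placeholders.
PROMPT_TEMPLATE = """\
You are generating code for our training-pipeline service.
Follow these constraints exactly:
- Python 3.11; use only the dependency versions pinned in requirements.txt.
- Every external service call needs explicit error handling and logging.
- Any function over 20 lines needs inline comments explaining its logic.
- Prefix generated helper functions with "aigen_".

Task: {task_description}

Relevant existing code:
{context_snippets}
"""


def build_prompt(task_description: str, context_snippets: str) -> str:
    """Fill the template so every request carries the same ground rules."""
    return PROMPT_TEMPLATE.format(
        task_description=task_description,
        context_snippets=context_snippets,
    )
```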
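For the test-first bullet, this is the shape of it: the tests pin down the expected behaviour and edge cases before a single line is generated, and the AI then implements against them. The module path and function name are hypothetical (they mirror the batching example from earlier).

```python
# Written before any code is generated; the import target is whatever
# module the AI will be asked to implement.
import pytest

from pipeline.batching import batch_records  # hypothetical module


def test_exact_multiple_of_batch_size():
    batches = batch_records(list(range(200)), batch_size=100)
    assert batches == [list(range(100)), list(range(100, 200))]


def test_trailing_partial_batch_is_kept():
    # The edge case that bit us: the remainder must not be silently dropped.
    batches = batch_records(list(range(250)), batch_size=100)
    assert sum(len(b) for b in batches) == 250
    assert len(batches[-1]) == 50


def test_empty_input_returns_no_batches():
    assert batch_records([], batch_size=100) == []


def test_non_positive_batch_size_is_rejected():
    with pytest.raises(ValueError):
        batch_records(list(range(10)), batch_size=0)
```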
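For the AI-specific safeguards bullet, here's a minimal sketch of the kind of check we mean, using Python's ast module. The thresholds and the standalone-script form are illustrative; a real version would plug into whatever linter framework you already run.

```python
# Minimal sketch of an AI-specific lint check: deeply nested conditionals
# and functions with too many parameters. Thresholds are illustrative.
import ast
import pathlib
import sys

MAX_IF_DEPTH = 3
MAX_PARAMS = 5


class AiCodeLinter(ast.NodeVisitor):
    def __init__(self):
        self.problems = []

    def visit_FunctionDef(self, node):
        n_params = (len(node.args.posonlyargs)
                    + len(node.args.args)
                    + len(node.args.kwonlyargs))
        if n_params > MAX_PARAMS:
            self.problems.append(
                f"line {node.lineno}: '{node.name}' takes {n_params} "
                f"parameters (max {MAX_PARAMS})")
        self._check_if_depth(node, depth=0)
        self.generic_visit(node)

    visit_AsyncFunctionDef = visit_FunctionDef

    def _check_if_depth(self, node, depth):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue  # nested functions are checked on their own visit
            if isinstance(child, ast.If):
                if depth + 1 > MAX_IF_DEPTH:
                    self.problems.append(
                        f"line {child.lineno}: conditionals nested more than "
                        f"{MAX_IF_DEPTH} levels deep")
                self._check_if_depth(child, depth + 1)
            else:
                self._check_if_depth(child, depth)


if __name__ == "__main__":
    source = pathlib.Path(sys.argv[1]).read_text()
    linter = AiCodeLinter()
    linter.visit(ast.parse(source))
    for problem in linter.problems:
        print(problem)
```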
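And for the automated checks bullet, one piece of the gauntlet might look like this pytest-style performance gate. The endpoint, the client fixture and the budget numbers are made up for illustration; the security scans and integration tests sit alongside it in CI.

```python
# Hypothetical performance gate run before AI-generated code ships.
import time
import tracemalloc

RESPONSE_TIME_BUDGET_S = 0.2            # illustrative budget
MEMORY_BUDGET_BYTES = 50 * 1024 * 1024  # illustrative budget


def test_metadata_endpoint_stays_within_budget(client):
    # "client" is assumed to be a test-client fixture for the service.
    tracemalloc.start()
    start = time.perf_counter()

    response = client.get("/runs/demo/metadata")

    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    assert response.status_code == 200
    assert elapsed < RESPONSE_TIME_BUDGET_S, f"too slow: {elapsed:.3f}s"
    assert peak < MEMORY_BUDGET_BYTES, f"peak memory {peak} bytes over budget"
```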
Sometimes AI will blast out code in minutes that would take days to architect properly. Sometimes we’ll ban it entirely because the maintenance headache isn’t worth the speed. Being practical beats being perfect. Trust but verify: hard limits, clear standards – and always have someone who knows what’s actually going on. To go back to that house wiring analogy: you need to know exactly what you’re working with before you start adding new circuits. Otherwise, you’re just creating problems for the next person who has to fix it. And trust me, that next person might be you.