How My AI Agent Almost Broke the ERP Database
A few weeks ago, my AI junior dev agent did something that made my stomach drop.
It was fixing a bug in the inventory module — a simple off-by-one error in a stock calculation query. The agent wrote the fix, opened a PR, and I approved it without a second look. Standard workflow.
What I didn’t notice was the side effect: the agent had also “optimized” an index on the orders table. Dropped the old one. Created a new one. In production.
The query went from a 2 ms index scan to a 12-second full table scan. The API started timing out. Users saw loading spinners. My phone buzzed with alerts.
That was the moment I realized: databases are the hardest challenge for AI agents. Not because agents are dumb — but because databases are unforgiving.
Why Databases Are Different
Andy Pavlo from CMU put it perfectly: “If an agent hallucinates a UI component, the page looks slightly off. If it hallucinates a query or a configuration change in a production database, the entire system can vanish.”
In my ERP, the difference is stark:
| Agent Task | Failure Mode | Impact |
|---|---|---|
| Fix a button color | Ugly UI | Annoying |
| Add a new API route | 500 error | One feature broken |
| Change a DB index | Full table scan | Entire module down |
| DROP a column | Data loss | Permanent |
| ALTER a table | Lock contention | Queries on that table queue up |
The stakes are completely different. And the agent doesn’t know that.
The Three Database Traps My Agent Fell Into
Trap 1: The “Helpful” Index Change
The agent saw a slow query in the logs and decided to “fix” it by changing an index. It didn’t realize that index was also used by three other queries in different modules. The fix for one query broke the other three.
Lesson: Agents optimize locally, not globally. They see one slow query and assume the fix is isolated. In a database, nothing is isolated.
Trap 2: The Cartesian Join
The agent was generating a report query. It joined five tables without proper join conditions. The result was a 2-million-row Cartesian product that took 45 seconds to return. The agent thought it was being thorough. The database thought it was being attacked.
Lesson: Agents don’t have an intuition for data volume. A join that looks correct logically can be catastrophic computationally.
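The arithmetic behind a Cartesian product is simple: with no join conditions, the result can have as many rows as the product of the table sizes. The sketch below turns that into a pre-flight check. The row counts and the 100,000-row limit are hypothetical values chosen for illustration, not figures from my ERP, and the function names are made up for this example.

```python
import math

def worst_case_rows(row_counts):
    """Upper bound on result rows when tables are joined with no conditions."""
    return math.prod(row_counts)

def check_join_size(row_counts, max_rows=100_000):
    """Raise if an unconstrained join could exceed max_rows; else return the bound."""
    estimate = worst_case_rows(row_counts)
    if estimate > max_rows:
        raise ValueError(
            f"Join could return up to {estimate:,} rows (limit {max_rows:,})"
        )
    return estimate

# Hypothetical row counts: five tables that each look small on their own.
hypothetical_counts = [20, 10, 10, 10, 100]
print(worst_case_rows(hypothetical_counts))  # 2000000
```

None of these tables has more than 100 rows, yet the unconstrained join reaches two million. That is the intuition an agent lacks when it looks at a query that reads correctly.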
Trap 3: The Schema “Improvement”
The agent suggested renaming a column from status to order_status for “clarity.” It wrote a migration, ran it, and broke every view, stored procedure, and reporting query that referenced the old column name.
Lesson: Schema changes have cascading effects that no agent can fully predict without understanding the entire codebase — which it doesn’t.
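A cheap first check before any rename is to search the database catalog for objects whose definitions mention the column. The sketch below does that for SQLite views and triggers by scanning the sqlite_master table. It is a plain text search, not real dependency analysis, so it can miss or over-report references, and it says nothing about application code or reporting queries stored outside the database. The schema and view names are hypothetical.

```python
import re
import sqlite3

def objects_referencing(conn, column_name):
    """List (type, name) of views and triggers whose SQL text mentions column_name."""
    pattern = re.compile(rf"\b{re.escape(column_name)}\b", re.IGNORECASE)
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        "WHERE type IN ('view', 'trigger') ORDER BY name"
    ).fetchall()
    return [(kind, name) for kind, name, sql in rows if sql and pattern.search(sql)]

# Hypothetical schema with one view that depends on the column and one that does not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL);
    CREATE VIEW open_orders AS SELECT id, total FROM orders WHERE status = 'open';
    CREATE VIEW order_totals AS SELECT id, total FROM orders;
""")

print(objects_referencing(conn, "status"))
```

Other databases expose similar catalogs for views and stored procedures, but the queries differ per engine, so treat this as the shape of the check rather than a portable tool.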
What I Built to Fix This
After that index incident, I implemented a three-layer defense:
Layer 1: Read-Only Query Layer
The agent never gets direct database access. Instead, it goes through a middleware layer that:
- Routes read queries to a read replica
- Blocks any write/DDL statements
- Logs every query for audit
- Kills any query running longer than 5 seconds
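```python
class BlockedQuery(Exception):
    """Raised when the agent submits a statement the layer refuses to run."""

class SafeQueryLayer:
    # Result, _is_destructive, _is_cartesian and _read_replica are defined elsewhere.
    def execute(self, sql: str) -> "Result":
        # Reject writes and DDL before anything reaches the database.
        if self._is_destructive(sql):
            raise BlockedQuery("Write/DDL statements are not allowed from the agent")
        # Reject joins that look like they are missing a join condition.
        if self._is_cartesian(sql):
            raise BlockedQuery("Query may cause a Cartesian join")
        # Everything else runs on the read replica with a 5-second timeout.
        return self._read_replica.execute(sql, timeout=5)
```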
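The class above leaves its helpers undefined, so here is a self-contained sketch of the same idea that runs against SQLite. It is a stand-in under assumptions, not my production middleware: a single connection with PRAGMA query_only plays the role of the read replica, the prefix check is a deliberately crude replacement for _is_destructive, and the time limit uses sqlite3's progress handler. The class layout and the audit_log attribute are hypothetical.

```python
import sqlite3
import time

class BlockedQuery(Exception):
    """Raised when the layer refuses or aborts a statement."""

class SafeQueryLayer:
    READ_PREFIXES = ("SELECT", "WITH", "EXPLAIN")

    def __init__(self, conn, timeout_seconds=5.0):
        self._conn = conn
        self._timeout = timeout_seconds
        self.audit_log = []
        # Second line of defense: SQLite itself rejects writes on this connection.
        self._conn.execute("PRAGMA query_only = ON")

    def execute(self, sql):
        self.audit_log.append(sql)  # log every attempt, allowed or not
        if not sql.lstrip().upper().startswith(self.READ_PREFIXES):
            raise BlockedQuery("Only read queries are allowed from the agent")
        deadline = time.monotonic() + self._timeout
        # The handler runs every 1000 VM steps; a truthy return aborts the query.
        self._conn.set_progress_handler(lambda: time.monotonic() > deadline, 1000)
        try:
            return self._conn.execute(sql).fetchall()
        except sqlite3.OperationalError as exc:
            raise BlockedQuery(f"Query failed or was aborted: {exc}") from exc
        finally:
            self._conn.set_progress_handler(None, 0)

# Hypothetical data, loaded before the connection is switched to read-only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders (status) VALUES ('open')")
conn.commit()

layer = SafeQueryLayer(conn, timeout_seconds=0.2)
print(layer.execute("SELECT status FROM orders"))
```

The two checks overlap on purpose. A statement such as a WITH clause followed by DELETE slips past the prefix check, but the read-only connection still refuses it, which is the same reason the real layer points at a replica instead of trusting string matching alone.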
Layer 2: Query Validation Gate
Every SQL query the agent generates goes through a validator that checks for:
- DROP, ALTER, and TRUNCATE statements
- DELETE or UPDATE statements without a WHERE clause
- Joins without join conditions (explicit or implicit cross joins)
- Queries touching more than 5 tables
If any check fails, the query is rejected and the agent gets a clear error message explaining why.
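A rough sketch of such a validator, using only regular expressions, looks like this. It is a heuristic under stated assumptions and not the validator I run: it does not parse SQL, it ignores subqueries and string literals, and it will flag some legitimate queries (a NATURAL JOIN, for example). The function name validate_sql and the message texts are hypothetical; a real gate would be better built on a proper SQL parser.

```python
import re

MAX_TABLES = 5  # threshold from the checklist above

def validate_sql(sql):
    """Return a list of human-readable problems; an empty list means accepted."""
    text = re.sub(r"\s+", " ", sql).strip().rstrip(";").upper()
    problems = []

    if re.match(r"(?:DROP|ALTER|TRUNCATE)\b", text):
        problems.append("DDL statements (DROP, ALTER, TRUNCATE) are not allowed")

    has_where = re.search(r"\bWHERE\b", text) is not None
    if re.match(r"(?:DELETE|UPDATE)\b", text) and not has_where:
        problems.append("DELETE and UPDATE need a WHERE clause")

    # Count JOIN keywords against ON/USING conditions, plus comma-style joins.
    joins = len(re.findall(r"\bJOIN\b", text))
    conditions = len(re.findall(r"\b(?:ON|USING)\b", text))
    from_clause = re.search(
        r"\bFROM\b(.*?)(?:\bWHERE\b|\bGROUP BY\b|\bORDER BY\b|$)", text
    )
    commas = from_clause.group(1).count(",") if from_clause else 0

    if joins > conditions or (commas and not has_where):
        problems.append("Join without a join condition may produce a Cartesian product")

    tables = (1 if from_clause else 0) + joins + commas
    if tables > MAX_TABLES:
        problems.append(f"Query touches {tables} tables; the limit is {MAX_TABLES}")
    return problems

print(validate_sql("DELETE FROM orders"))
```

Returning every problem at once, instead of stopping at the first, gives the agent a complete explanation to work from when it rewrites the query.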
Layer 3: Human Gate for Schema Changes
The agent can suggest schema changes — but it cannot execute them. It writes a migration file, opens a PR, and I review it. This adds 5 minutes to my day and saves me from production incidents.
The rule is simple: the agent can read anything, write nothing, and suggest everything.
The Result
Since implementing these guardrails:
- Zero database incidents in 3 weeks
- Agent still handles 12+ tickets per week — the guardrails don’t slow it down
- Query quality improved: the validator gives specific feedback, so the agent can fix a rejected query instead of repeating the mistake
- My stress level dropped — I no longer worry about what the agent is doing to the database
The Bigger Lesson
The database challenge for AI agents is not about making agents smarter. It is about designing the right boundaries.
Agents are powerful. But they lack context about production systems — they don’t know which queries are critical, which tables are hot, or which schema changes will cascade into broken views. Giving them unrestricted database access is like giving a junior developer production root on day one.
The solution is not to take the agent away from the database. It is to design the interface between the agent and the database with the same care you would give any production API: validation, rate limiting, audit logging, and human gates for dangerous operations.
Your agent can be 10x more productive than a human developer. But only if the database stays standing.