Your AI Agent Is Bleeding Money — Here’s How to Stop It

June 1, 2026 · 6 min read

Last week a product called AgentGuard hit the front page of Hacker News. Its pitch was simple: automatically kill AI agents before they burn through your budget. That it picked up 47 points within hours tells you something: runaway agent costs are becoming a real problem.

I’ve been running autonomous coding agents for over a month now. My pipeline costs about $35/month, but I’ve seen screenshots of people burning $200 in a single afternoon because an agent got stuck in a loop. If you’re building with AI agents and not tracking costs, you’re one runaway turn away from an expensive surprise.


How Agents Waste Money

There are three common patterns I’ve seen:

1. The Infinite Loop. The agent tries to fix a bug, makes it worse, tries again, breaks something different, and tries again. Forty turns later it’s spent $4 and the code is in a worse state than when it started.

2. The Wrong Model. Using Sonnet/Opus for every tiny task. A simple “add a getter” doesn’t need a $0.15 turn. That’s what Haiku is for.

3. The Context Bloat Death Spiral. Every turn adds context. Context grows → costs increase → agent gets confused → more turns needed → more context added. By turn 20, you’re paying $0.30 per turn for an agent that’s forgotten what it was doing.
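A rough sketch of the arithmetic behind pattern 3. Every number below (starting context size, tokens added per turn, price per million input tokens) is a made-up assumption chosen for illustration, not a measurement or a real price list. The point is only the shape of the curve when each turn re-sends a growing context.

```python
# Illustrative only: all three constants are assumptions, not real prices.
BASE_TOKENS = 5_000        # hypothetical context size on turn 1
GROWTH_PER_TURN = 4_000    # hypothetical tokens added to the context each turn
PRICE_PER_MTOK = 3.00      # hypothetical USD per million input tokens


def turn_cost(turn: int) -> float:
    """Input cost of one turn when the context grows linearly."""
    tokens = BASE_TOKENS + GROWTH_PER_TURN * (turn - 1)
    return tokens * PRICE_PER_MTOK / 1_000_000


def session_cost(turns: int) -> float:
    """Total input cost of a session that runs for `turns` turns."""
    return sum(turn_cost(t) for t in range(1, turns + 1))


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(f"{n:>2} turns: last turn ${turn_cost(n):.3f}, total ${session_cost(n):.2f}")
```

Because the per-turn cost grows linearly, the session total grows roughly with the square of the turn count. Under these assumptions, letting a session run 40 turns instead of 20 nearly quadruples the bill rather than doubling it.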


The Three-Layer Defense

After a month of running agent pipelines, here’s what actually works for keeping costs predictable:

Layer 1: Hard Budget Limits

When I spawn a Claude Code agent, I set two things:

--max-turns 15 --max-budget-usd 1.00

These are non-negotiable. If the agent can’t fix it in 15 turns under a dollar, it’s either a hard bug that needs a human, or I’m using the wrong model. Either way, it’s not the agent’s job to keep trying.
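If you spawn agents from a script, one option is to build the command in a single place so the caps can never be forgotten. This is a minimal sketch: the function name, the example prompt, and the use of non-interactive `-p` mode are assumptions for illustration; only the two limit flags come from the line above.

```python
import shlex


def build_agent_command(prompt: str, max_turns: int = 15,
                        max_budget_usd: float = 1.00) -> list[str]:
    """Build the argument list for one capped agent run (hypothetical helper)."""
    if max_turns < 1 or max_budget_usd <= 0:
        raise ValueError("limits must be positive")
    return [
        "claude",
        "-p", prompt,  # assumption: the agent runs in non-interactive mode
        "--max-turns", str(max_turns),
        "--max-budget-usd", f"{max_budget_usd:.2f}",
    ]


if __name__ == "__main__":
    # The prompt and file path are placeholders.
    cmd = build_agent_command("Fix the failing test in tests/test_auth.py")
    print(shlex.join(cmd))
    # To launch for real, pass the list to subprocess.run(cmd).
```

Passing the arguments as a list, rather than one formatted string, avoids shell-quoting problems when the prompt contains quotes or spaces.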

The key insight: failing fast is cheaper than succeeding slowly. A $1 failed attempt costs less than a $5 successful one that took 40 turns and left the codebase in a questionable state.

Layer 2: Model Tiering by Task

Not every task needs the same model. I use a simple triage:

  • Haiku ($0.0008/turn): QC checks, linting, simple CRUD, adding tests. Under $0.10 per task.
  • Sonnet ($0.015/turn): Bug fixes, feature additions, code review. Under $1.00 per task.
  • Opus ($0.06/turn): Architecture decisions, security auditing, multi-file refactoring. Reserved for complex tasks, used maybe 2-3 times a week.

This alone cut my costs by 60% compared to using Sonnet for everything.
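The triage can be as small as a lookup table. In this sketch the per-turn costs come from the list above, while the tier keys, the task labels, and the choice of Sonnet as the fallback for unknown tasks are assumptions made for the example.

```python
# Per-turn costs are from the list above; keys and task labels are hypothetical.
TIERS = {
    "haiku": {"cost_per_turn": 0.0008, "tasks": {"qc", "lint", "crud", "tests"}},
    "sonnet": {"cost_per_turn": 0.015, "tasks": {"bugfix", "feature", "review"}},
    "opus": {"cost_per_turn": 0.06, "tasks": {"architecture", "security", "refactor"}},
}
DEFAULT_TIER = "sonnet"  # assumption: unknown task types go to the middle tier


def pick_tier(task_type: str) -> str:
    """Return the name of the tier that handles this task type."""
    for name, tier in TIERS.items():
        if task_type in tier["tasks"]:
            return name
    return DEFAULT_TIER


def estimate_cost(task_type: str, turns: int) -> float:
    """Rough cost estimate: per-turn cost of the chosen tier times turn count."""
    return TIERS[pick_tier(task_type)]["cost_per_turn"] * turns


if __name__ == "__main__":
    for task in ("lint", "bugfix", "security"):
        print(task, pick_tier(task), f"${estimate_cost(task, 15):.3f} for 15 turns")
```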

Layer 3: Session-Level Kill Switch

What AgentGuard does for agents in general, I do at the session level using Hermes cron. Every run tracks turns, cost, and progress, and checks them against these abort conditions:

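# Abort conditions, checked after every turn.
# turns_without_progress is a counter the runner resets whenever progress is made.
if session_turns > 15:
    kill_and_escalate("Too many turns")
elif session_cost > 1.00:
    kill_and_escalate("Budget exceeded")
elif turns_without_progress >= 5:
    kill_and_escalate("Agent is stuck in a loop")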
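For a self-contained version of the same checks, here is one possible shape. The class and exception names are hypothetical, and deciding whether a turn "made progress" (for example, the failing-test count dropped) is left to the caller, because that depends on your pipeline.

```python
from dataclasses import dataclass


class SessionAborted(Exception):
    """Raised when a session crosses one of the abort thresholds."""


@dataclass
class SessionGuard:
    max_turns: int = 15
    max_cost_usd: float = 1.00
    max_stalled_turns: int = 5
    turns: int = 0
    cost_usd: float = 0.0
    stalled_turns: int = 0

    def record_turn(self, turn_cost_usd: float, made_progress: bool) -> None:
        """Update counters after a turn; raise if any limit is crossed."""
        self.turns += 1
        self.cost_usd += turn_cost_usd
        # A turn with progress resets the stall counter.
        self.stalled_turns = 0 if made_progress else self.stalled_turns + 1

        if self.turns > self.max_turns:
            raise SessionAborted("Too many turns")
        if self.cost_usd > self.max_cost_usd:
            raise SessionAborted("Budget exceeded")
        if self.stalled_turns >= self.max_stalled_turns:
            raise SessionAborted("Agent is stuck in a loop")


if __name__ == "__main__":
    guard = SessionGuard()
    try:
        for _ in range(20):
            guard.record_turn(turn_cost_usd=0.02, made_progress=False)
    except SessionAborted as reason:
        print(f"aborted after {guard.turns} turns: {reason}")
```

Raising an exception keeps the policy separate from the reaction: the caller catches `SessionAborted` and decides how to kill the process and escalate.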

The key metric is cost per ticket over time. If it’s trending up, something is wrong — probably CLAUDE.md needs updating or the agent is fighting with recent changes.
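One simple way to watch that trend is to compare the average cost of the most recent tickets with the batch before them. The window size and the function name here are arbitrary choices for the sketch, not a recommendation.

```python
from statistics import mean


def cost_trend(costs: list[float], window: int = 10) -> float:
    """Ratio of the latest window's mean cost to the previous window's mean.

    `costs` is a list of per-ticket costs in chronological order.
    """
    if len(costs) < 2 * window:
        raise ValueError("need at least two full windows of tickets")
    previous = mean(costs[-2 * window:-window])
    recent = mean(costs[-window:])
    if previous == 0:
        raise ValueError("previous window has zero cost; ratio is undefined")
    return recent / previous


if __name__ == "__main__":
    # Made-up history: ten cheap tickets followed by ten pricier ones.
    history = [0.30] * 10 + [0.45] * 10
    print(f"cost per ticket changed by a factor of {cost_trend(history):.2f}")
```

A ratio above 1.0 means tickets are getting more expensive. Where to draw the line between noise and a real problem is a judgment call.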


The Real Cost of Not Watching Costs

I track every session cost in a daily CSV. After 4 weeks of running the autonomous pipeline, here’s what the data told me:

  • Average cost per ticket: $0.38
  • Most expensive single ticket: $1.42 (a security auth bug that needed Opus)
  • Cheapest tickets: $0.04 (Haiku + simple test addition)
  • Wasted spend (failed attempts): ~7% of total — acceptable
  • Monthly total: ~$35 — less than a SaaS subscription

Without these guardrails, I’d probably be spending 3x that. The agent doesn’t care about your API budget — it will happily iterate forever if you let it.
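Numbers like the ones above can be derived from the CSV with a few lines of standard-library Python. The column names and the sample rows below are invented for the example (they are not my real log); adapt them to whatever your own file records.

```python
import csv
import io

# Hypothetical log format and made-up rows: one row per agent session.
SAMPLE = """date,ticket,model,turns,cost_usd,succeeded
2026-05-04,T-101,haiku,4,0.05,yes
2026-05-04,T-102,sonnet,9,0.40,yes
2026-05-05,T-103,sonnet,6,0.10,no
2026-05-05,T-103,opus,11,1.20,yes
"""


def summarize(csv_text: str) -> dict:
    """Aggregate session rows into per-ticket cost statistics."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no sessions logged")
    per_ticket: dict[str, float] = {}
    for row in rows:
        per_ticket[row["ticket"]] = per_ticket.get(row["ticket"], 0.0) + float(row["cost_usd"])
    total = sum(per_ticket.values())
    # "Wasted" spend is the cost of sessions that did not succeed.
    wasted = sum(float(r["cost_usd"]) for r in rows if r["succeeded"] != "yes")
    return {
        "total": total,
        "avg_per_ticket": total / len(per_ticket),
        "max_ticket": max(per_ticket.values()),
        "wasted_share": wasted / total if total else 0.0,
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value:.4f}")
```

Costs are summed per ticket, so a failed attempt followed by a retry counts toward the same ticket.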


What I’d Do Differently

If I were starting today, I’d set up cost tracking on day one, not day 21. It takes 10 minutes and saves you from the “let me check my API dashboard… oh no” moment.
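A minimal version of that day-one tracking could look like the sketch below: one function that appends a row per session. The file name, column names, and function name are assumptions for illustration.

```python
import csv
from datetime import date
from pathlib import Path

# Hypothetical column layout for the cost log.
FIELDS = ["date", "ticket", "model", "turns", "cost_usd", "succeeded"]


def log_session(path: Path, ticket: str, model: str, turns: int,
                cost_usd: float, succeeded: bool) -> None:
    """Append one session to the CSV, writing the header on first use."""
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": date.today().isoformat(),
            "ticket": ticket,
            "model": model,
            "turns": turns,
            "cost_usd": f"{cost_usd:.4f}",
            "succeeded": "yes" if succeeded else "no",
        })


if __name__ == "__main__":
    # Placeholder values; call this once at the end of every agent session.
    log_session(Path("agent_costs.csv"), "T-101", "haiku", 4, 0.05, True)
```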

The tooling is getting better fast. AgentGuard, Veto, and others are building exactly this. But you don’t need a third-party tool — a simple script that tracks turns and costs per session, plus hard kill limits, covers 90% of the problem.

Your agent is a powerful tool. But like any tool, it needs a governor. Good prompt engineering and cost governance go hand in hand. Set one before you need one.


This is part of the ongoing series on building autonomous AI developer pipelines at susiloharjo.web.id. Follow me on X for updates.

