I Built An AI Junior Dev That Fixes ERP Tickets In My Sleep

May 27, 2026 · 6 min read

Series: AI Junior Developer — Part 1 of 6

I maintain a few ERP apps. Nothing fancy — inventory, finance, HR modules for a small company. But here’s the thing: I’m the only developer. When a user reports “laporan stok error pas filter tanggal,” I’m the one who drops everything, opens the codebase, traces the bug, fixes it, runs tests, and deploys.

That worked fine when there were 5 users. Now there are 50. And the tickets don’t stop coming.

Last week I counted: 12 bug tickets in 7 days. Half of them were small — null pointer here, date parsing issue there. Things a junior developer could handle in 20 minutes. But I’m not a junior developer anymore, and I don’t have one to delegate to.

So I built one.

The Idea

What if I had an AI agent that:

Monitors my ticket system (GitHub Issues or Jira) every 30 minutes
Reads the ticket, understands the bug
Dives into my codebase, finds the problematic code
Reproduces the bug first, then fixes it
Runs the test suite to make sure nothing breaks
Creates a branch, commits, pushes, opens a PR
Reports back to me on Telegram with the PR link

And most importantly: I only get involved at the PR review stage. Human-in-the-loop, always. The agent proposes, I approve.

This isn’t science fiction. Every piece of this stack exists and works today. I just needed to wire them together.

The Stack

After experimenting with a few combinations, here’s what I landed on:

Orchestrator: Hermes Agent

Hermes Agent is an open-source AI agent framework by Nous Research. Think of it as the “project manager” — it runs cron jobs, spawns subagents, handles messaging (Telegram, Discord, Slack), and delegates work. It’s the brain that decides what to do and when.

Why Hermes and not just a bash script with cron? Because Hermes understands context. When a ticket says “inventory report broken,” Hermes can read the ticket, assess whether it’s a code bug or a user question, and decide whether to wake up the code agent or just log it. A bash script can’t do that.

Code Executor: Claude Code CLI

Claude Code is Anthropic’s autonomous coding agent. It runs in your terminal, reads your entire codebase, writes code, runs shell commands, and manages git — all by itself. I use it in print mode (claude -p "fix this bug"), which means it runs one-shot, does the work, and exits.

This is where the actual coding happens. Claude Code reads the ERP codebase, traces the bug, writes the fix, and runs pytest. No human touches the keyboard.

The Glue: MCP (Model Context Protocol)

MCP is the standard for connecting AI agents to external tools. Through MCP, Claude Code can query my PostgreSQL database (read-only, for debugging), access GitHub (for creating PRs), and even pull Sentry stack traces if something’s on fire.

The Memory: CLAUDE.md

This is the secret sauce. CLAUDE.md is a file Claude Code reads at the start of every session. It contains:

Build commands (make test, make lint)
Architecture notes (/modules/inventory/ handles stock)
Gotchas — every mistake Claude made, captured as a rule

The creator of Claude Code, Boris Cherny, calls this “compounding engineering.” Every time Claude makes a mistake, you tell it “update CLAUDE.md so you don’t repeat this.” After a few weeks, the same prompts produce dramatically better output because Claude remembers every gotcha in your project.

How It Works (The 30-Second Version)

08:00 — User reports: "Laporan stok error pas filter tanggal" 08:30 — Hermes cron job detects new GitHub issue #142 08:31 — Hermes spawns Claude Code:          "Fix #142. Reproduce first. Write failing test. Fix. Run make test. Create PR." 08:35 — Claude Code: bug found (naive datetime vs timezone-aware). Fixed. Tests pass. 08:36 — PR #523 opened. Hermes sends Telegram notification. 08:37 — I wake up, see the PR, review, merge.

I haven’t written a single line of code for this fix.

The Agent Team (Multi-Agent Architecture)

Here’s where it gets interesting. One agent is an assistant. A team of agents is a department.

I designed four specialized agents, each with a different role:

Agent	Role	Tools
Supervisor	Reads tickets, triages, decides who does what	Hermes orchestrator
Coder	Investigates bug, writes fix, runs tests	Claude Code (Read, Edit, Bash)
Reviewer	Reviews PR in a fresh context, unbiased	Claude Code subagent (Read-only)
QC/Tester	Runs full test suite, checks edge cases	Claude Code (Read, Bash)

The flow: Supervisor detects ticket → assigns to Coder → Coder produces PR → Reviewer checks with fresh eyes → QC runs final verification → I review and merge.

This Writer/Reviewer pattern is something the Anthropic team swears by. The reviewer runs in a separate session with zero implementation bias, so it catches things the coder missed. Two Claudes, two contexts, one result.

What This Series Covers

This was the vision. The rest is implementation. Over the next few posts, I’ll walk through every piece of the stack in detail:

Post 2: Setting Up Hermes Agent — Installation, configuration, cron jobs, Telegram integration
Post 3: Setting Up Claude Code — CLAUDE.md mastery, skills, subagents, MCP servers
Post 4: Multi-Agent Architecture — Supervisor, coder, reviewer, QC. Human-in-the-loop workflow
Post 5: Autonomous Cron Pipeline — Auto-detect GitHub issues, assign to agents, handle failures
Post 6: Monitoring & Continuous Improvement — Cost tracking, compounding CLAUDE.md, weekly maintenance

Each post will have real commands, real config files, real gotchas. Nothing theoretical. If you want to build your own AI junior dev by the end of this series, you will.

The Cost

I run this on DeepSeek V4 Pro for the orchestrator and Claude Sonnet for the code agent. Total cost:

~$0.30 per ticket (Claude Sonnet, ~5 turns)
~$5/month for Hermes orchestrator
Roughly $30-50/month for 2-3 tickets per day

That’s less than a junior developer’s lunch budget. And it works 24/7.

The Real Shift

After spending two weeks building this system, I realized something. The value isn’t just “AI fixes bugs faster.” It’s that I stopped thinking about code and started thinking about systems.

I don’t debug inventory reports anymore. I design agent workflows. I write CLAUDE.md rules. I review PRs that an AI wrote based on a user’s bug report.

The setup is the work. The execution is verification.

Next post: wiring up Hermes Agent from scratch. See you tomorrow.

This is Post 1 of the “AI Junior Developer” series. Follow along at susiloharjo.web.id.

Discover more from Susiloharjo

Subscribe to get the latest posts sent to your email.