Are We Using AI Wisely Yet?

Susiloharjo

Somewhere in the West, a server room is humming right now. The room is cold in a way that has nothing to do with weather. Somewhere downstream from that room, a river is running a little warmer than it was last summer. Somewhere near that river, a child is brushing her teeth. The water in her cup was clean this morning. By the end of the day, it might not be.


Homelab AI Agent Costs Down 60% with Ollama Quantized Models

My homelab AI agent setup was costing $42/month in API calls alone — until I switched to local quantized models.

That number came straight from my OpenRouter billing dashboard for April 2026: 350,000 tokens across Claude 3 Haiku and Mistral Small, mostly from my personal agent that checks GitHub notifications, drafts tweets, and summarizes my daily reading. At $0.00012 per 1K tokens for Haiku and $0.00006 for Mistral, the math added up fast. I’d told myself local LLMs weren’t ready for prime time — too slow, too finicky, too much VRAM — until I hit a psychological wall: paying for something I could run myself felt like renting a bicycle when I owned a garage full of parts.

I decided to fix it. Over three weekends, I rebuilt my agent pipeline around Ollama, quantized Llama 3 models, and deliberate GPU time-slicing. The result? My monthly LLM API spend dropped to $0. My agent still handles the same tasks: sometimes faster, sometimes slower, but consistently useful. Along the way I learned concrete lessons about optimizing AI agent performance with structured prompts, quantization tradeoffs, containerized GPU sharing, and why “local first” doesn’t mean “local only.”

Here’s exactly what I did, what I measured, and what broke along the way.


gRPC vs REST: When to Use Which

Most “gRPC vs REST” articles pick a side. gRPC is the future. REST is simpler. The reality is more nuanced: the right choice depends on who calls the API, how often, and what the consumers can actually debug at 2 AM. This article compares the two protocols side by side and walks through three production cases where the choice was not obvious from the start.


What Responsible AI Actually Means for Builders


Most “responsible AI” content reads like it was written by a policy team that has never deployed an agent to production. The checklists are long. The principles are abstract. And none of them tell you what to do when your agent starts hallucinating customer data at 3 AM and the on-call engineer is asleep.

I have been building AI agents for about a year now. Not research. Not demos. Actual agents that touch real data, make real decisions, and occasionally break things in ways I did not anticipate. Here is what responsible AI looks like from the builder’s side — not the policy side.
