Claude Sonnet 5: The First Agentic Model That Can Code and Use Tools on Par with Opus

Susiloharjo

Anthropic has just released Claude Sonnet 5 with an ambitious claim: this is the most agentic Sonnet model to date. It can make plans, use tools such as a browser and a terminal, and run autonomously at a level that, a few months ago, only large, expensive models could reach. Notably, Sonnet 5's performance comes close to Opus 4.8 …

TLS 1.3 Features. What Changed From TLS 1.2. And Why It Matters

Susiloharjo

TLS 1.3 shipped in 2018 as RFC 8446. It is the biggest overhaul of the TLS protocol since SSL 3.0 in 1996. Five years after release, TLS 1.3 now protects over 70 percent of HTTPS connections. The upgrade is not incremental. It changes how the handshake works. It removes broken cipher suites. It mandates forward secrecy. And it makes the protocol faster.

This post covers what TLS 1.3 adds. What it removes. And why the changes matter for anyone running a server or building a client.
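For anyone building a client, the practical side of this can be sketched in a few lines. The snippet below is a minimal illustration using Python's standard `ssl` module, not a configuration taken from this post; the helper name `tls13_only_context` is hypothetical. It builds a client context that refuses to negotiate anything older than TLS 1.3.

```python
import ssl


def tls13_only_context() -> ssl.SSLContext:
    """Return a client context that only accepts TLS 1.3 (hypothetical helper)."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse TLS 1.2 and older during the handshake.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


if __name__ == "__main__":
    ctx = tls13_only_context()
    print("minimum version:", ctx.minimum_version.name)
```

Pinning the floor to TLS 1.3 will break connections to servers that only speak TLS 1.2, so this is a sketch of the strict end of the spectrum rather than a recommended default.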


SSL Is Dead. TLS Runs The Web. Here Is What Actually Encrypts Your Traffic

Susiloharjo

Every HTTPS connection on the modern web uses TLS. Not SSL. The padlock icon in browsers still says SSL. Hosting providers sell SSL certificates. DevOps teams talk about SSL termination. But SSL has been dead since 2015, when RFC 7568 deprecated SSL 3.0. What actually protects traffic is TLS 1.2 or TLS 1.3.

This distinction matters. SSL and TLS are not the same protocol. They have different security properties. Different handshake flows. Different cipher suites. One of them, SSL, has known vulnerabilities that attackers can exploit, POODLE among them.
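One way to see which protocol actually protects a connection is to ask the socket after the handshake. The following is a minimal sketch using Python's standard `ssl` and `socket` modules; the function names `modern_client_context` and `negotiated_protocol` are hypothetical, and the TLS 1.2 floor is an assumption chosen to match the versions named above.

```python
import socket
import ssl


def modern_client_context() -> ssl.SSLContext:
    """Client context that refuses anything older than TLS 1.2 (hypothetical helper)."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_protocol(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port and return the protocol version the handshake settled on."""
    ctx = modern_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            # version() returns a string such as "TLSv1.3", or None if unavailable.
            return tls.version() or "unknown"
```

Calling `negotiated_protocol` with a hostname of your choice needs network access. With this context, a server that can only offer SSL 3.0 fails the handshake instead of connecting.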


60 Percent of My API Calls Were Cached. I Turned It Off.


It is Tuesday afternoon. I am looking at my Grafana dashboard. The cache hit rate says 60 percent. Six out of ten API requests are being served from Redis, not the database.

By every metric I learned, this should be a win. Cache hits are fast. Database queries are slow. The math is simple.

But my p95 latency went up 40 milliseconds after I added caching.

Not down. Up.

I spent three days chasing this. I added more cache. I tuned TTLs. I pre-warmed the cache with likely queries. Nothing helped. The more I cached, the slower things got.

Then I found the bug. It wasn’t in the cache layer. It wasn’t in the database. It was in my assumptions about what caching actually does.
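The excerpt does not say what the bug was, so the following is not the author's explanation. It is only a sketch, with made-up numbers, of how a 60 percent hit rate and a higher p95 can coexist: the 95th percentile falls among the misses, and each miss now pays for a cache lookup and a write-back on top of the database query. All constants and function names here are assumptions.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integers
    return ordered[rank - 1]


def mean(samples):
    return sum(samples) / len(samples)


# Illustrative latencies in milliseconds; none of these come from the post.
DB_MS = 100          # assumed database query time
CACHE_LOOKUP_MS = 5  # assumed Redis round trip
CACHE_WRITE_MS = 5   # assumed cost of writing a miss back to the cache

# Before caching: every request goes to the database.
before = [DB_MS] * 100

# After caching: 60 hits, 40 misses that pay lookup + query + write-back.
hits = [CACHE_LOOKUP_MS] * 60
misses = [CACHE_LOOKUP_MS + DB_MS + CACHE_WRITE_MS] * 40
after = hits + misses

if __name__ == "__main__":
    print("mean before/after:", mean(before), mean(after))
    print("p95  before/after:", p95(before), p95(after))
```

With these assumed numbers the mean drops from 100 ms to 47 ms while p95 rises from 100 ms to 110 ms. A hit rate below 95 percent cannot lower p95 on its own, because the slowest 5 percent of requests are all misses.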


I Stopped Self-Hosting AI: Why DeepSeek V4 Pro on Ollama Cloud Is My New Default

Susiloharjo

The most-said line in my group chats this week was three words: “I miss Fable.”

Not in a nostalgic way. In a “my entire workflow is broken” way.

Fable was the model I used for first-draft generation. Fast, cheap, good enough for 80 percent of the work. Then it vanished. No deprecation warning. No migration path. Just gone.

My first reaction was what a lot of people are doing now: go local. Buy a GPU, run llama.cpp, never depend on a vendor again. I spent $1,400 on a used RTX 4090. I downloaded 150GB of model weights. I learned to love the sound of my fans spinning at 80 percent.

For one month, self-hosting worked. Then the novelty wore off.

The 4090 draws 450W under load. My electricity bill went up $35. The 70B models I was running maxed out at 32K context, which is not enough for full codebase reviews. Batch processing hundreds of documents meant queuing jobs overnight. And when Opus 4.8 dropped with significantly better reasoning, I had no way to access it without going back to the cloud anyway.

I was renting infrastructure, not avoiding vendors. The landlord just changed from Anthropic to NVIDIA.
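To put a rough number on that rent, here is a back-of-the-envelope sketch. The $1,400 card and the $35 electricity increase come from the text above. The 24-month amortization period and the reading of $35 as a monthly figure are my assumptions, and the function name is hypothetical.

```python
def monthly_self_hosting_cost(hardware_usd, amortization_months, electricity_usd_per_month):
    """Spread the hardware price over its assumed lifetime and add the power bill."""
    return hardware_usd / amortization_months + electricity_usd_per_month


# From the text: $1,400 for the GPU, $35 extra electricity (assumed to be per month).
HARDWARE_USD = 1400
ELECTRICITY_USD_PER_MONTH = 35
# Assumption, not from the text: the card is written off over two years.
ASSUMED_AMORTIZATION_MONTHS = 24

cost = monthly_self_hosting_cost(
    HARDWARE_USD, ASSUMED_AMORTIZATION_MONTHS, ELECTRICITY_USD_PER_MONTH
)

if __name__ == "__main__":
    print(f"effective monthly cost: ${cost:.2f}")
```

Under those assumptions the card costs a little over $93 a month before counting any time spent on maintenance. A shorter or longer amortization period moves that figure considerably.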

Then I tried DeepSeek V4 Pro on Ollama Cloud. The pricing made me reconsider everything.
