Vibe Coding In Indonesian Costs 50% More Tokens

I use AI coding tools daily. Most of the time I just describe what I want in plain language and let the model generate the code. You know the drill — “vibe coding.” But last week I noticed something: when I asked in Indonesian, the responses felt… heavier. More tokens in the chat, slower, hitting context limits faster.

So I measured it.

The test

I took 5 common coding requests — the kind you’d actually ask an LLM — and ran them through GPT-4’s tokenizer (`cl100k_base`). Same intent, same detail, just different language.


Prompt 1 (ID): "Buatkan fungsi Python untuk validasi email dengan regex"
Tokens: 16

Prompt 1 (EN): "Write a Python function to validate email with regex"
Tokens: 11
→ ID uses 45% more tokens


Prompt 2 (ID): "Tolong refactor kode ini biar lebih clean, pisahkan
logic database dari business logic, dan tambahkan error handling"
Tokens: 30

Prompt 2 (EN): "Please refactor this code to be cleaner, separate
database logic from business logic, and add error handling"
Tokens: 20
→ ID uses 50% more tokens


Prompt 3 (ID), full vibe coding session:
"Gue punya API endpoint yang lambat. Query PostgreSQL pakai
JOIN 3 tabel, data 2 juta row. Bantu optimasi — kasih tau
index apa yang perlu ditambah, dan rewrite query-nya."
Tokens: 54

Prompt 3 (EN):
"I have a slow API endpoint. PostgreSQL query with 3-table JOIN,
2 million rows. Help me optimize — tell me what indexes to add,
and rewrite the query."
Tokens: 36
→ ID uses 50% more tokens

The pattern held across all five tests: the Indonesian prompts used roughly 40-60% more tokens than their English equivalents, and the three shown here landed at 45-50%.
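Each percentage above is just the ratio of the two token counts. Here is a minimal sketch of that arithmetic, using the counts reported for the three prompts shown. The function name `overhead_pct` is made up for illustration and is not part of any library.

```python
def overhead_pct(id_tokens: int, en_tokens: int) -> float:
    """Return how many percent more tokens the Indonesian prompt uses."""
    if en_tokens <= 0:
        raise ValueError("en_tokens must be positive")
    return (id_tokens / en_tokens - 1) * 100


# (Indonesian tokens, English tokens) as reported for the three prompts above
measured = [(16, 11), (30, 20), (54, 36)]

for n, (id_tok, en_tok) in enumerate(measured, start=1):
    pct = overhead_pct(id_tok, en_tok)
    print(f"Prompt {n}: {pct:.0f}% more tokens in Indonesian")
```

Running it on the three pairs gives 45%, 50% and 50%, matching the figures listed with each prompt.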

Why your tokenizer hates bahasa Indonesia

LLM tokenizers like `cl100k_base` use byte-pair encoding: character sequences that appear often in the training data get merged into single tokens. That data is overwhelmingly English, so common coding words like `function`, `query`, `refactor` mostly end up as one token or close to it, because they appear millions of times.

Indonesian words? The tokenizer has to break them apart. `Buatkan` becomes `Bu`, `at`, `kan`. `Validasi` becomes `Val`, `id`, `asi`. Every fragment is a separate token.

The full prompt “Buatkan fungsi Python untuk validasi email dengan regex” is 8 words and 16 tokens. “Write a Python function to validate email with regex” is 9 words and 11 tokens. Same information, one word fewer, 45% more expensive.
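To see the fragments for yourself, you can decode each token ID back to its text. The sketch below assumes an encoder object with `encode` and `decode_single_token_bytes` methods, which is the interface a `tiktoken` encoding exposes. The helper names `token_pieces` and `tokens_per_word` are illustrative, not library functions.

```python
def token_pieces(enc, text):
    """Return the text fragment behind each token ID the encoder produces."""
    return [
        # A token may be a partial UTF-8 sequence, so decode leniently
        enc.decode_single_token_bytes(token).decode("utf-8", errors="replace")
        for token in enc.encode(text)
    ]


def tokens_per_word(enc, text):
    """Average number of tokens per whitespace-separated word."""
    words = text.split()
    if not words:
        return 0.0
    return len(enc.encode(text)) / len(words)


# Usage, assuming tiktoken is installed (pip install tiktoken):
#   import tiktoken
#   enc = tiktoken.get_encoding("cl100k_base")
#   print(token_pieces(enc, "Buatkan"))
#   print(tokens_per_word(enc, "Write a Python function to validate email with regex"))
```

Tokens per word is a handy single number for comparing languages: the further it sits above 1, the more the tokenizer is splitting words apart.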

What this costs you over a month

Let’s say you do 20 vibe coding sessions per day, average 200 words per prompt:

– Indonesian: ~5,200 tokens/day

– English: ~3,500 tokens/day

That is about 1,700 extra tokens a day, or 51,000 over a 30-day month, just from language choice. With pay-per-token APIs like GPT-4 ($0.03/1K input), that’s ~$1.50. Small number, but multiply it across a dev team of 10 people using AI coding tools daily and it comes to roughly $15 a month, growing with every extra prompt.
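The monthly figure is simple arithmetic on the two daily estimates. Here is a small sketch so you can plug in your own numbers. The 30-day month is an assumption, and the function and constant names are made up for illustration.

```python
TOKENS_PER_DAY_ID = 5_200   # daily estimate from this post (Indonesian)
TOKENS_PER_DAY_EN = 3_500   # daily estimate from this post (English)
PRICE_PER_1K_INPUT = 0.03   # USD per 1K input tokens, as quoted in this post
DAYS_PER_MONTH = 30         # assumption: usage every day of the month


def extra_monthly_cost(id_per_day, en_per_day, price_per_1k,
                       days=DAYS_PER_MONTH, team_size=1):
    """Return (extra tokens, extra USD) per month caused by the language gap."""
    extra_tokens = (id_per_day - en_per_day) * days * team_size
    extra_usd = extra_tokens / 1000 * price_per_1k
    return extra_tokens, extra_usd


for team in (1, 10):
    tokens, usd = extra_monthly_cost(
        TOKENS_PER_DAY_ID, TOKENS_PER_DAY_EN, PRICE_PER_1K_INPUT, team_size=team
    )
    print(f"Team of {team}: {tokens:,} extra tokens, about ${usd:.2f} per month")
```

With the estimates above this gives 51,000 extra tokens and about $1.53 for one developer, and ten times that for a team of 10.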

But the bigger hit is the context window. If the same conversation costs about 50% more tokens in Indonesian, a fixed window holds only about two-thirds as much of it, meaning the LLM loses track of earlier messages sooner. You hit limits faster, get worse responses, and waste output tokens on reprompting.
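To put a number on "fills faster": if the same content takes `ratio` times as many tokens, a fixed window holds `1 / ratio` as much of it. This is a rough sketch with assumed values, an 8,192-token window and the 1.5x ratio measured above, and it only applies to the Indonesian parts of the conversation. The function name is made up for illustration.

```python
def effective_capacity(window_tokens: int, ratio: float) -> int:
    """English-equivalent tokens of content that fit when text costs `ratio` times more."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return int(window_tokens / ratio)


WINDOW = 8_192   # assumption: an 8K-token context window
RATIO = 1.5      # Indonesian-to-English token ratio measured in this post

capacity = effective_capacity(WINDOW, RATIO)
lost_share = 1 - capacity / WINDOW
print(f"Window of {WINDOW:,} tokens holds ~{capacity:,} English-equivalent tokens")
print(f"That is about {lost_share:.0%} less conversation before the window is full")
```

At a 1.5x ratio the window holds roughly a third less conversation, which is why earlier messages drop out sooner.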

My rule now

I still think in Indonesian. But when I type into an LLM? English. Every time. Comments, prompts, function names — all English.

It’s not about being less Indonesian. It’s about the tool you’re using. If the tokenizer doesn’t speak your language efficiently, adapt.

Here’s the script if you want to test your own prompts:


import tiktoken  # pip install tiktoken

# cl100k_base is the GPT-4 tokenizer used for the tests above
enc = tiktoken.get_encoding("cl100k_base")

# (Indonesian prompt, English equivalent) pairs; swap in your own
prompts = [
    ("Buatkan REST API endpoint dengan FastAPI untuk CRUD user",
     "Create a REST API endpoint with FastAPI for user CRUD"),
    ("Kenapa query PostgreSQL gue lambat? Ada 2 juta row, udah pakai index",
     "Why is my PostgreSQL query slow? 2 million rows, already indexed"),
    ("Refactor kode Angular ini, pisahin component dan service-nya",
     "Refactor this Angular code, separate components and services"),
]

for id_text, en_text in prompts:
    # The token count is the length of the encoded list of token IDs
    id_tok = len(enc.encode(id_text))
    en_tok = len(enc.encode(en_text))
    # A ratio above 1.00x means the Indonesian prompt costs more tokens
    print(f"ID: {id_tok} tokens, EN: {en_tok} tokens -> {id_tok/en_tok:.2f}x")

Run it on the prompts you actually use. The numbers are probably worse than you think.
