Chitral Patil
telemetry · cost · tools

Meter beats calculator

Calculators predict a best case you will rarely hit. A meter reads the truth off live telemetry. Why I built the meter instead of another spreadsheet.


I kept getting asked the same question in different costumes: what does it cost to serve this model? Every answer started with a spreadsheet, and every spreadsheet started with a lie — an assumed utilization number, typed in by a human who wanted the result to look good.

So I stopped building calculators and built a meter.

A calculator predicts. A meter observes.

A calculator takes your assumptions and multiplies them. Garbage in, confident garbage out. It will happily tell you a model costs USD 0.40 per million tokens because you told it the GPU runs at 90% utilization — a figure you have no way to defend.
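
To make that concrete, here is a minimal sketch of what a calculator does. The function name, the USD 2.00/hour price, and the 2,000 tokens/s peak are hypothetical values chosen for illustration, not figures from any real deployment.

```python
def calculator_cost_per_million(gpu_usd_per_hour: float,
                                peak_tokens_per_second: float,
                                assumed_utilization: float) -> float:
    """Predicted USD per million tokens, computed from typed-in assumptions."""
    if not 0.0 < assumed_utilization <= 1.0:
        raise ValueError("assumed_utilization must be in (0, 1]")
    # Everything downstream is scaled by the one number nobody measured.
    tokens_per_hour = peak_tokens_per_second * assumed_utilization * 3600
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000


# Hypothetical inputs: a USD 2.00/hour GPU with a 2,000 tokens/s peak.
for utilization in (0.9, 0.5, 0.1):
    cost = calculator_cost_per_million(2.00, 2000, utilization)
    print(f"assumed utilization {utilization:.0%}: USD {cost:.2f} per million tokens")
```

The price and the peak throughput never change between the three lines. Only the assumed utilization does, and the answer at 10% is nine times the answer at 90%.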

A meter does the opposite. It watches the running server and reports what is actually happening: request rate, time-to-first-token, time-per-output-token, batch occupancy, KV cache pressure. From those, the effective cost falls out as a measurement, not a guess.

Meter beats calculator. Not because the math is fancier — because it refuses to assume the one number that matters most.

What the meter reads

vllm-cost-meter is a read-only observer. It never touches the inference path; it ingests Prometheus metrics a vLLM server already emits and turns them into a live cost-per-million-token readout:

  • throughput and request rate
  • TTFT (time to first token), TPOT (time per output token), and end-to-end latency
  • prompt and generation lengths
  • batch state and KV cache utilization
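
The core of that readout can be sketched in a few lines. This is not the vllm-cost-meter source, only an illustration of the idea: take two scrapes of the server's metrics text, difference a token counter, and divide a known hourly price by the observed rate. The counter name `vllm:generation_tokens_total` is an assumption (metric names vary across vLLM versions), the helper names are hypothetical, and the sketch counts only generated tokens and ignores optional timestamps edge cases beyond the basic text format.

```python
def read_counter(exposition_text: str, metric_name: str) -> float:
    """Sum one counter across all label sets in Prometheus text format."""
    total, found = 0.0, False
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line.startswith(metric_name):
            continue  # also skips "# HELP" / "# TYPE" comment lines
        rest = line[len(metric_name):]
        if rest.startswith("{"):
            rest = rest[rest.rfind("}") + 1:]  # drop the label block
        elif not rest.startswith(" "):
            continue  # a different metric that merely shares the prefix
        fields = rest.split()
        if fields:
            total += float(fields[0])  # first field is the sample value
            found = True
    if not found:
        raise KeyError(metric_name)
    return total


def measured_cost_per_million(tokens_before: float, tokens_after: float,
                              elapsed_seconds: float,
                              gpu_usd_per_hour: float) -> float:
    """USD per million generated tokens over one observation window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = tokens_after - tokens_before
    if delta < 0:
        raise ValueError("counter went backwards (server restart?)")
    if delta == 0:
        return float("inf")  # hardware was paid for, nothing was served
    tokens_per_second = delta / elapsed_seconds
    return gpu_usd_per_hour / (tokens_per_second * 3600) * 1_000_000


# Two made-up scrapes taken 60 seconds apart.
METRIC = "vllm:generation_tokens_total"  # assumed name, check your server
scrape_a = '# TYPE vllm:generation_tokens_total counter\n' \
           'vllm:generation_tokens_total{model_name="demo"} 1000.0\n'
scrape_b = 'vllm:generation_tokens_total{model_name="demo"} 7000.0\n'

cost = measured_cost_per_million(read_counter(scrape_a, METRIC),
                                 read_counter(scrape_b, METRIC),
                                 elapsed_seconds=60.0, gpu_usd_per_hour=2.00)
print(f"measured: USD {cost:.2f} per million tokens")
```

In practice the two strings would come from the server's metrics endpoint rather than literals. Nothing in the calculation asks for a utilization figure; the token rate is whatever the server actually did in the window.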

Near idle, I have watched the effective cost climb to 36.3× the saturated figure on the same hardware. No calculator will ever show you that number, because no one types "we run at 3% utilization at 4am" into a spreadsheet.
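
That multiplier can be read backwards. A minimal sketch, assuming the hourly hardware price is fixed, so that cost per token scales inversely with throughput (the function names are hypothetical):

```python
def cost_multiplier(throughput_fraction: float) -> float:
    """How many times the saturated cost you pay at a given fraction of saturated throughput."""
    if not 0.0 < throughput_fraction <= 1.0:
        raise ValueError("throughput_fraction must be in (0, 1]")
    return 1.0 / throughput_fraction


def throughput_fraction_for(multiplier: float) -> float:
    """Inverse: the fraction of saturated throughput implied by a cost multiplier."""
    if multiplier < 1.0:
        raise ValueError("multiplier must be at least 1")
    return 1.0 / multiplier


fraction = throughput_fraction_for(36.3)
print(f"36.3x the saturated cost implies {fraction:.2%} of saturated throughput")
```

Under that assumption, 36.3× corresponds to a server doing a little under 3% of the work it could be doing, which is the 4am case in a single number.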

The bias I trust

I like tools that tell operators the truth, even when the truth is unflattering. A meter is harder to fool than a calculator because it is reporting reality instead of negotiating with it. That is the whole design philosophy: build the meter, not another calculator.