CodeBurn: Finally a Tool That Shows You Where Your AI Coding Budget Actually Goes
1,500 stars in roughly two days. That's not manufactured hype; that's developers recognizing that a real itch is finally getting scratched. If you've been using Claude Code, Codex, or Cursor seriously for more than a few weeks, you've probably had the moment: the bill lands and you have no idea what you actually spent it on. CodeBurn is an attempt to fix that.
What It Actually Does
CodeBurn is a CLI tool that reads your AI coding tool session data directly off disk — no wrappers, no proxies, no API key required — and renders an interactive terminal dashboard showing you where your tokens went. It parses Claude Code's JSONL session files, Codex's session directory, and Cursor's local SQLite database, then classifies activity into 13 task categories (coding, debugging, refactoring, git ops, etc.), breaks down costs by model, project, and tool, and even calculates a "one-shot rate" — how often the AI got it right on the first edit without needing a retry cycle.
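To make the "reads session data off disk" part concrete, here's a minimal sketch of summing token usage out of a Claude Code-style JSONL session file. The field names (`message.usage.input_tokens` and so on) are illustrative assumptions, not the exact on-disk schema CodeBurn parses.

```typescript
// Sketch: summing token usage from a JSONL session file, one JSON object per line.
// Field names are assumed for illustration; the real schema may differ.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

function sumSessionTokens(jsonl: string): Usage {
  const total: Usage = { input_tokens: 0, output_tokens: 0 };
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    let entry: any;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // tolerate a truncated final line from an in-progress session
    }
    const usage = entry?.message?.usage;
    if (usage) {
      total.input_tokens += usage.input_tokens ?? 0;
      total.output_tokens += usage.output_tokens ?? 0;
    }
  }
  return total;
}
```

The append-only JSONL format is what makes zero-instrumentation reading viable: a parser can skip malformed trailing lines and re-read the file cheaply.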
The TUI is built with Ink (React for terminals), so it's actually navigable. Arrow keys switch time windows, p toggles between providers if you use multiple, and the charts render with gradients. It's not a wall of JSON — it's something you'd actually keep open.
Why This Matters Right Now
The timing here is the whole story. AI coding tools went from experimental to essential for a lot of teams in the past year, but the cost observability story has been essentially nonexistent. Claude Code charges per token. Cursor's auto mode hides which model it's even using. Codex has its own session format. None of these tools give you a unified view of what you're actually spending or on what.
Enterprise teams have started asking hard questions about AI coding ROI. Individual developers on pay-as-you-go plans are getting surprised by bills. The gap between "I use AI coding tools" and "I understand my AI coding spend" has been wide open, and no first-party tooling has filled it. CodeBurn is an independent attempt to do exactly that.
The community response — 106 forks in the same window as those 1,500 stars — suggests people aren't just starring it out of curiosity. They're looking at the code.
Features Worth Calling Out
Zero-instrumentation data collection. This is the design decision that makes the whole thing practical. CodeBurn doesn't ask you to change how you work. It reads existing session files that Claude Code, Codex, and Cursor already write to disk. There's no wrapper to install, no proxy to route through, no risk of breaking your existing setup. For Cursor specifically, it reads the local SQLite database at ~/Library/Application Support/Cursor/User/globalStorage/state.vscdb. A side effect of this approach is that your data never leaves your machine, which is actually a feature.
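For a sense of what "reads files off disk" means across platforms, here's a hypothetical path-resolution helper. Only the macOS location above is confirmed; the Linux and Windows paths below are my guesses based on where VS Code-family apps usually keep globalStorage, not anything CodeBurn ships.

```typescript
// Sketch: resolving the Cursor state database path per platform.
// Only the darwin path is confirmed by the README; the others are guesses.
import os from "node:os";
import path from "node:path";

function cursorDbPath(platform: string = process.platform): string {
  const home = os.homedir();
  switch (platform) {
    case "darwin":
      return path.join(home, "Library/Application Support/Cursor/User/globalStorage/state.vscdb");
    case "win32": // speculative; CodeBurn has no documented Windows support
      return path.join(home, "AppData/Roaming/Cursor/User/globalStorage/state.vscdb");
    default: // linux and friends, following the XDG default
      return path.join(home, ".config/Cursor/User/globalStorage/state.vscdb");
  }
}
```

A switch like this is also roughly what Windows support would require, which is relevant to the concerns section below.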
Activity classification without LLM calls. The 13-category classifier is fully deterministic — it looks at tool usage patterns and keyword matching in user messages. No round-trips to an API to understand what you were doing. This matters for trust: you can read the classification logic yourself and understand why a session got tagged as "Debugging" vs "Refactoring." It's not a black box.
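A deterministic classifier of this kind can be sketched in a few lines. The categories and keywords here are illustrative, chosen to match the examples in this article; they are not CodeBurn's actual rule set.

```typescript
// Sketch of a deterministic keyword classifier in the spirit of CodeBurn's
// 13-category scheme. Rules and keywords are illustrative, not the tool's own.
const RULES: Array<{ category: string; keywords: string[] }> = [
  { category: "Debugging",     keywords: ["fix", "error", "why does", "stack trace"] },
  { category: "Refactoring",   keywords: ["refactor", "rename", "extract", "clean up"] },
  { category: "Brainstorming", keywords: ["what if", "ideas", "brainstorm"] },
  { category: "Git Ops",       keywords: ["commit", "rebase", "merge", "branch"] },
];

function classifyMessage(text: string): string {
  const lower = text.toLowerCase();
  // First matching rule wins, so rule order encodes priority.
  for (const rule of RULES) {
    if (rule.keywords.some((k) => lower.includes(k))) return rule.category;
  }
  return "Coding"; // fallback category when nothing matches
}
```

The point of the sketch is the trust argument: every classification decision is a substring match you can read and argue with, which is exactly why it will sometimes misfire (see the concerns below).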
One-shot rate tracking. This one's clever. For edit-heavy activities, CodeBurn detects Edit → Bash → Edit retry cycles and calculates what percentage of edit turns succeeded without needing a fix loop. A 90% one-shot rate on Coding means the AI got it right on the first try 9 out of 10 times. This is actually useful signal for evaluating whether a more expensive model is worth the cost for certain task types.
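The retry-cycle detection can be approximated from a session's tool-call sequence. This is a simplified sketch under my own assumptions (CodeBurn's real heuristic is presumably more nuanced): an Edit counts as a retry when it's followed by a Bash run and then another Edit.

```typescript
// Sketch: one-shot rate over a sequence of tool-call names.
// Edit -> Bash -> Edit is read as "edit, run the tests, fix again" (a retry).
function oneShotRate(toolCalls: string[]): number {
  let edits = 0;
  let retries = 0;
  for (let i = 0; i < toolCalls.length; i++) {
    if (toolCalls[i] !== "Edit") continue;
    edits++;
    if (toolCalls[i + 1] === "Bash" && toolCalls[i + 2] === "Edit") retries++;
  }
  // No edits at all counts as a perfect rate rather than a division by zero.
  return edits === 0 ? 1 : (edits - retries) / edits;
}
```

Even this crude version yields the kind of signal the article describes: compare the rate per model on edit-heavy sessions and you have a cost-justification number.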
Multi-provider support with a plugin architecture. Claude Code, Codex, and Cursor are supported today. The provider plugin system is documented and the README points you at src/providers/codex.ts as a reference implementation. The commit history shows Cursor was added in v0.5.0 just two days ago, which means the plugin system is already being exercised. If you're using a tool that's not supported yet, the path to adding it is a single file.
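A "single file per provider" design usually means each plugin implements a small common interface. The shape below is my guess at what that contract looks like; the names (Provider, SessionRecord, loadSessions) are assumptions, not the actual exports of src/providers/codex.ts.

```typescript
// Sketch of a provider plugin contract; names and fields are assumed.
interface SessionRecord {
  startedAt: Date;
  model: string;
  inputTokens: number;
  outputTokens: number;
  project?: string;
}

interface Provider {
  name: string;                              // e.g. "claude-code", "codex", "cursor"
  dataDir(): string;                         // where this tool writes session data
  loadSessions(): Promise<SessionRecord[]>;  // normalize on-disk data to a common shape
}

// A trivial in-memory provider showing the shape a new plugin would fill in:
const demoProvider: Provider = {
  name: "demo",
  dataDir: () => "/tmp/demo",
  loadSessions: async () => [
    { startedAt: new Date(0), model: "sonnet", inputTokens: 100, outputTokens: 20 },
  ],
};
```

The dashboard layer only ever sees SessionRecord-shaped data, which is what makes adding a new tool a one-file change.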
Currency support and a macOS menu bar widget. These feel like nice-to-haves until you're on a non-USD plan or you just want a persistent cost indicator without opening a terminal. The menu bar widget requires SwiftBar, which is a reasonable dependency. Exchange rates pull from the European Central Bank via Frankfurter — no API key, cached 24 hours.
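The no-API-key, 24-hour-cache design is simple enough to sketch. The Frankfurter endpoint shown is the real public one, but the cache shape and freshness logic here are my assumptions, not CodeBurn's implementation.

```typescript
// Sketch: 24-hour cached exchange rates from Frankfurter (no API key needed).
// Cache shape and TTL handling are assumed for illustration.
const CACHE_TTL_MS = 24 * 60 * 60 * 1000;

interface RateCache {
  fetchedAt: number;             // epoch ms when rates were retrieved
  rates: Record<string, number>; // e.g. { EUR: 0.92 }
}

function isFresh(cache: RateCache, now: number = Date.now()): boolean {
  return now - cache.fetchedAt < CACHE_TTL_MS;
}

async function getRates(cache: RateCache | null): Promise<RateCache> {
  if (cache && isFresh(cache)) return cache; // reuse within the 24h window
  const res = await fetch("https://api.frankfurter.app/latest?base=USD");
  const body = await res.json();
  return { fetchedAt: Date.now(), rates: body.rates };
}
```

Since ECB reference rates only update once per business day, a 24-hour TTL loses essentially nothing.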
Who Should Use This
Individual developers on pay-as-you-go AI coding plans are the obvious primary audience. If you're spending real money on Claude Code or Codex and you want to understand the ROI — which projects, which task types, which models — this gives you that visibility in about 30 seconds.
Team leads trying to justify or right-size AI coding budgets will find the per-project and per-model breakdowns useful for conversations with finance or engineering leadership. The CSV/JSON export means you can get the data into whatever reporting tool you already use.
Developers who use multiple AI coding tools and want a unified view. The provider toggle in the dashboard is genuinely useful if you're context-switching between Claude Code for some tasks and Cursor for others.
Who probably shouldn't bother right now: Windows users — the Cursor SQLite path is hardcoded to macOS's ~/Library/Application Support/ and there's no mention of Windows support anywhere in the README. Teams using Cursor's business/enterprise plan where billing is centralized — the local SQLite database may not have the data you're looking for. And if you're primarily a Cursor user expecting granular tool-call breakdowns, you won't get that — Cursor doesn't log individual tool calls, so the Languages panel is the best available proxy.
Honest Concerns
This repo is two days old. I want to be direct about that. The commit timestamps show the repo was created April 13th and v0.5.0 landed April 15th. The 70-commit history from a single author suggests this was developed privately before being open-sourced, but there's no track record of maintenance, no release tags (the latest release section is empty), and the contributor base is tiny. Stars are a measure of interest, not stability.
The Cursor first-run problem is real. The commit message literally says "First run parses the 21GB DB (slow, ~40-80s)." That's a significant UX cliff. The file-based result cache means subsequent runs are fast, but 40-80 seconds on first run is the kind of thing that makes people think something is broken. It's documented, but it needs a better progress indicator.
Cursor cost estimates are rough. Since Cursor's Auto mode hides the actual model, costs are estimated using Sonnet pricing and labeled "Auto (Sonnet est.)" in the dashboard. This is the honest approach — it's better than pretending you know — but it means your Cursor cost numbers could be meaningfully off depending on what models Cursor is actually routing to.
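The estimation itself is just per-million-token arithmetic against assumed prices. The figures below match Sonnet-class published pricing at the time of writing, but treat them as placeholders; the function name is mine, not CodeBurn's.

```typescript
// Sketch of an "Auto (Sonnet est.)" cost estimate: without knowing the routed
// model, cost is approximated with Sonnet-class per-token prices.
const SONNET_EST = { inputPerMTok: 3.0, outputPerMTok: 15.0 }; // USD, assumed

function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * SONNET_EST.inputPerMTok +
    (outputTokens / 1_000_000) * SONNET_EST.outputPerMTok
  );
}
```

If Cursor is actually routing to a cheaper or pricier model, the estimate scales off by exactly the price ratio, which is why the "meaningfully off" caveat matters.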
No Windows support visible. The macOS path assumptions are baked in for Cursor and the menu bar widget is SwiftBar-only. Linux support for Claude Code and Codex likely works (XDG paths), but I wouldn't assume it without testing.
The activity classification will misfire. Keyword matching is fast and transparent, but it's not precise. "What if" in a message triggers Brainstorming. A file with a .py extension in a code block contributes to the Python language count. These heuristics are reasonable starting points, but they're heuristics. Don't treat the category breakdowns as ground truth.
Verdict
Install it, run it, see if it tells you something useful. The install is npx codeburn — there's no reason not to try it. If you use Claude Code or Codex, the data parsing is solid and the dashboard is genuinely informative. If you're primarily a Cursor user, temper your expectations: you'll get aggregate cost estimates and language breakdowns, not the granular tool-call visibility you'd get from Claude Code.
The bigger question is whether this gets maintained. A two-day-old repo with 1,500 stars and a single primary author fits a common pattern: tools that get abandoned after the initial wave of attention. The plugin architecture and the quality of the commit messages suggest the author is thinking about the long-term shape of the project, but I'd want to see a few months of issue responses and releases before I'd depend on this in any serious workflow.
For now: worth your 30 seconds to try, worth watching if you care about AI coding cost observability, and worth contributing to if you have a provider you want to add. Just don't build anything critical on top of it yet.