Track Your LLM Costs in Real-Time Before They Surprise You
Set up real-time cost tracking for LLM API calls with token counting, dashboards, alert thresholds, and budget controls. Practical TypeScript examples included.
Transactional Team
Jan 24, 2026
8 min read
A common scenario: a team gets a $14,000 bill from OpenAI when they had budgeted $2,000. The culprit is a single endpoint passing entire conversation histories as context on every request, and nobody notices until the invoice arrives.
LLM costs are fundamentally different from traditional API costs. A single request can cost anywhere from $0.001 to $3.00 depending on the model, input size, and output length. Without real-time tracking, you are flying blind.
What You Will Learn
How to count tokens and calculate costs per provider
Building middleware for automatic cost tracking
Setting up alert thresholds and budget controls
Creating a cost dashboard with useful breakdowns
LLM Cost per 1M Input Tokens by Provider (Early 2026)

Claude Opus 4: $15.00
Claude Sonnet 4: $3.00
GPT-4o: $2.50
GPT-4.1: $2.00
Gemini 2.5 Pro: $1.25
Claude Haiku 4.5: $0.80
GPT-4o-mini: $0.15
Token Counting Per Provider
Every provider charges differently; the chart above lists input-token rates for popular models as of early 2026. To track spend, you record the token counts each provider returns and multiply by these rates.
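Those rates can be encoded in a simple pricing map and applied to the token counts returned in each API response. The input rates below come from the chart above; the output rates are typical provider multiples and should be verified against each provider's pricing page before use.

```typescript
// Per-1M-token rates in USD. Input rates match the chart above;
// output rates are assumed and should be checked against each
// provider's current pricing page.
const PRICING: Record<string, { input: number; output: number }> = {
  "claude-opus-4": { input: 15, output: 75 },
  "claude-sonnet-4": { input: 3, output: 15 },
  "gpt-4o": { input: 2.5, output: 10 },
  "gpt-4.1": { input: 2, output: 8 },
  "gemini-2.5-pro": { input: 1.25, output: 10 },
  "claude-haiku-4.5": { input: 0.8, output: 4 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
};

function calculateCost(
  model: string,
  inputTokens: number,
  outputTokens: number
): number {
  const rates = PRICING[model];
  if (!rates) throw new Error(`Unknown model: ${model}`);
  // Rates are per 1M tokens, so scale the raw counts down.
  return (
    (inputTokens / 1_000_000) * rates.input +
    (outputTokens / 1_000_000) * rates.output
  );
}
```

A GPT-4o call with 12,000 input tokens and 800 output tokens works out to roughly $0.038 under these rates, which is why a chatty endpoint that resends full conversation history can burn through a budget quickly.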
Store usage records in your database and query them for dashboard views:
```typescript
// Dashboard query examples

// Total spend by model (last 30 days)
const spendByModel = await db
  .select({
    model: llmUsage.model,
    totalCost: sql<number>`SUM(cost)`,
    requestCount: sql<number>`COUNT(*)`,
    avgLatency: sql<number>`AVG(latency_ms)`,
  })
  .from(llmUsage)
  .where(gte(llmUsage.timestamp, thirtyDaysAgo))
  .groupBy(llmUsage.model)
  .orderBy(desc(sql`SUM(cost)`));

// Cost trend by day
const dailyCosts = await db
  .select({
    date: sql<string>`DATE(timestamp)`,
    cost: sql<number>`SUM(cost)`,
    tokens: sql<number>`SUM(input_tokens + output_tokens)`,
  })
  .from(llmUsage)
  .where(gte(llmUsage.timestamp, thirtyDaysAgo))
  .groupBy(sql`DATE(timestamp)`)
  .orderBy(sql`DATE(timestamp)`);

// Top expensive endpoints
const costByEndpoint = await db
  .select({
    endpoint: llmUsage.endpoint,
    totalCost: sql<number>`SUM(cost)`,
    avgCost: sql<number>`AVG(cost)`,
    p99Cost: sql<number>`PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY cost)`,
  })
  .from(llmUsage)
  .where(gte(llmUsage.timestamp, thirtyDaysAgo))
  .groupBy(llmUsage.endpoint)
  .orderBy(desc(sql`SUM(cost)`));
```
The dashboard should show four things at a glance: current month spend vs budget, daily trend, breakdown by model, and top endpoints by cost.
Hard Budget Controls
Alerts are not enough. You need circuit breakers that actually stop spending:
```typescript
async function trackedCompletionWithLimits(
  params: OpenAI.ChatCompletionCreateParams,
  endpoint: string
): Promise<OpenAI.ChatCompletion> {
  // Check budget before making the call
  const currentMonthSpend = await getMonthlySpend();
  if (currentMonthSpend >= MONTHLY_BUDGET) {
    throw new BudgetExceededError(
      `Monthly budget of $${MONTHLY_BUDGET} exhausted. Current: $${currentMonthSpend.toFixed(2)}`
    );
  }

  // Estimate and check per-request limit
  const estimatedCost = estimateRequestCost(params);
  if (estimatedCost > MAX_COST_PER_REQUEST) {
    // Try to reduce cost: truncate context or use cheaper model
    params = reduceCost(params);
  }

  return trackedCompletion(params, endpoint);
}
```
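The `estimateRequestCost` helper above has to run before the call, so it cannot use the provider's reported token counts. One rough sketch uses the common ~4 characters per token heuristic for English text; for exact counts you would swap in a real tokenizer such as tiktoken. The rate constants and the 1024-token fallback here are illustrative assumptions.

```typescript
// Minimal request shape for estimation purposes.
interface ChatMessage {
  role: string;
  content: string;
}
interface CompletionParams {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
}

// Illustrative rates (USD per 1M tokens); verify against your pricing map.
const INPUT_RATE_PER_1M = 2.5;
const OUTPUT_RATE_PER_1M = 10;

function estimateRequestCost(params: CompletionParams): number {
  // ~4 chars per token is a rough heuristic; use a tokenizer for precision.
  const inputChars = params.messages.reduce((sum, m) => sum + m.content.length, 0);
  const estInputTokens = Math.ceil(inputChars / 4);
  // Worst case: the model produces its full output budget.
  const estOutputTokens = params.max_tokens ?? 1024;
  return (
    (estInputTokens / 1_000_000) * INPUT_RATE_PER_1M +
    (estOutputTokens / 1_000_000) * OUTPUT_RATE_PER_1M
  );
}
```

Because it assumes the full `max_tokens` budget is consumed, the estimate errs on the expensive side, which is the right bias for a pre-flight circuit breaker.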
The Key Takeaway
LLM cost tracking is not optional. Build it in from day one, before you get the surprise invoice. The implementation is straightforward: count tokens, calculate costs, store records, set alerts, add circuit breakers.
If you want this handled out of the box, Transactional's LLM Observability gives you real-time cost dashboards, per-model breakdowns, and configurable budget alerts without writing the tracking infrastructure yourself. But even if you build it in-house, the patterns above will get you most of the way there.