Overview

AI Gateway tracks the cost of every LLM request in real time. View spending by model, user, time period, and more in the dashboard.

How Costs Are Calculated

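Cost is calculated from the actual token usage of each request. Prices are quoted per 1M tokens, so each price in the formula is the per-1M price divided by 1,000,000: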

Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)

Example

For a GPT-4o request:

Input: 500 tokens × $2.50/1M = $0.00125
Output: 200 tokens × $10.00/1M = $0.002
Total: $0.00325
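The same arithmetic can be sketched in code. This is a minimal illustration of the formula above, not the gateway's implementation; the function and field names are hypothetical.

```typescript
// Prices are quoted per 1M tokens, so divide by 1,000,000 to get a per-token price.
const TOKENS_PER_MILLION = 1_000_000;

interface CostBreakdown {
  inputCost: number;
  outputCost: number;
  totalCost: number;
}

// Hypothetical helper: applies Cost = input tokens x input price + output tokens x output price.
function estimateCost(
  inputTokens: number,
  outputTokens: number,
  inputPricePerMillion: number,
  outputPricePerMillion: number,
): CostBreakdown {
  const inputCost = (inputTokens * inputPricePerMillion) / TOKENS_PER_MILLION;
  const outputCost = (outputTokens * outputPricePerMillion) / TOKENS_PER_MILLION;
  return { inputCost, outputCost, totalCost: inputCost + outputCost };
}

// The GPT-4o example: 500 input tokens at $2.50/1M, 200 output tokens at $10.00/1M.
const gpt4oExample = estimateCost(500, 200, 2.5, 10.0);
console.log(gpt4oExample.totalCost.toFixed(5)); // "0.00325"
```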

Viewing Costs

Dashboard Overview

Navigate to AI Gateway > Analytics to see:

Total Spend: Current month's total
Cost by Model: Breakdown by model
Cost by Day: Daily spending trend
Top Users: Highest spending users

Per-Request Costs

Every response includes the cost of the request in an x_cost object alongside usage:

{
  "id": "chatcmpl-123",
  "usage": {
    "prompt_tokens": 500,
    "completion_tokens": 200,
    "total_tokens": 700
  },
  "x_cost": {
    "input_cost": 0.00125,
    "output_cost": 0.002,
    "total_cost": 0.00325,
    "currency": "USD"
  }
}

Response Headers

The same cost values are also returned as HTTP response headers:

X-Cost-Input: 0.00125
X-Cost-Output: 0.002
X-Cost-Total: 0.00325
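If you call the gateway over plain HTTP, a small helper can turn these headers into numbers. The sketch below assumes the three header names shown above; the readCostHeaders function and its types are hypothetical, and it accepts any object with a get method, such as the Headers object on a fetch response.

```typescript
// Minimal shape shared by the Fetch API's Headers object and simple test doubles.
interface HeaderReader {
  get(name: string): string | null;
}

interface RequestCost {
  input: number;
  output: number;
  total: number;
}

// Returns null if any cost header is missing or is not a number.
function readCostHeaders(headers: HeaderReader): RequestCost | null {
  const input = Number.parseFloat(headers.get("X-Cost-Input") ?? "");
  const output = Number.parseFloat(headers.get("X-Cost-Output") ?? "");
  const total = Number.parseFloat(headers.get("X-Cost-Total") ?? "");
  if ([input, output, total].some((value) => Number.isNaN(value))) {
    return null;
  }
  return { input, output, total };
}
```

With fetch, this would be called as readCostHeaders(response.headers).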

Cost Breakdown

By Model

Model	Input ($/1M)	Output ($/1M)	Monthly Spend
gpt-4o	$2.50	$10.00	$1,234.56
gpt-4o-mini	$0.15	$0.60	$456.78
claude-3-5-sonnet	$3.00	$15.00	$789.00

By Time Period

View costs over different periods:

Today
This week
This month
Last 30 days
Custom range

By User

Track per-user spending by including the user parameter in each request:

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [...],
  user: 'user-123',  // Track by user
});
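The dashboard aggregates this for you. If you also keep your own request log, a similar per-user rollup can be sketched as follows. The record shape and the topUsers function are hypothetical; totalCost stands for the x_cost.total_cost value returned with each response.

```typescript
interface RequestCostRecord {
  user: string;      // value passed as the user parameter
  totalCost: number; // x_cost.total_cost from the response, in USD
}

// Sums cost per user and returns [user, spend] pairs, highest spend first.
function topUsers(records: RequestCostRecord[]): Array<[string, number]> {
  const totals = new Map<string, number>();
  for (const { user, totalCost } of records) {
    totals.set(user, (totals.get(user) ?? 0) + totalCost);
  }
  return Array.from(totals.entries()).sort((a, b) => b[1] - a[1]);
}
```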

Budget Alerts

Set up alerts that fire when spending exceeds a threshold.

Configuring Alerts

1. Go to Settings > Alerts
2. Click Add Alert
3. Configure:
   - Type: Budget Alert
   - Threshold: for example, $100 (daily) or $1,000 (monthly)
   - Channel: Email, Slack, or Webhook

Alert Types

Alert	Trigger
Daily Spend	Exceeds daily budget
Monthly Spend	Exceeds monthly budget
Cost Spike	50%+ increase from average
Model Cost	Specific model exceeds threshold
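To make the Cost Spike trigger concrete, here is a rough sketch of a 50% check. It assumes the average is the mean of the preceding days' spend; the baseline window the gateway actually uses is not stated here, and the isCostSpike function is hypothetical.

```typescript
// Assumption: "average" is the mean of the previous days' spend.
// Returns true when today's spend is at least `threshold` (50% by default) above that mean.
function isCostSpike(
  previousDailySpend: number[],
  todaySpend: number,
  threshold = 0.5,
): boolean {
  if (previousDailySpend.length === 0) {
    return false; // no baseline to compare against yet
  }
  const average =
    previousDailySpend.reduce((sum, value) => sum + value, 0) / previousDailySpend.length;
  if (average === 0) {
    return todaySpend > 0; // any spend after a zero baseline counts as a spike
  }
  return (todaySpend - average) / average >= threshold;
}
```

For example, with a trailing average of $100 per day, a day at $150 or more would count as a spike under these assumptions.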

Cost Optimization

1. Use Caching

Enable caching so that duplicate requests are served from the cache instead of being billed:

Metric	Before Cache	After Cache
Requests	100,000	100,000
Cache Hit Rate	0%	60%
Billable Requests	100,000	40,000
Cost Savings	-	60%
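The table follows from one line of arithmetic: billable requests = total requests × (1 − cache hit rate). The sketch below assumes cache hits are not billed and that requests cost roughly the same on average, so the savings percentage equals the hit rate. The function name is hypothetical.

```typescript
// Requests that still reach a provider (and are billed) at a given cache hit rate.
function billableRequests(totalRequests: number, cacheHitRate: number): number {
  if (cacheHitRate < 0 || cacheHitRate > 1) {
    throw new RangeError("cacheHitRate must be between 0 and 1");
  }
  return Math.round(totalRequests * (1 - cacheHitRate));
}

console.log(billableRequests(100_000, 0.6)); // 40000
```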

2. Choose the Right Model

Match model capability to task requirements:

Task	Recommended Model	Input Cost ($/1M tokens)
Simple Q&A	gpt-3.5-turbo	$0.50
Complex reasoning	gpt-4o	$2.50
Quick responses	claude-3-haiku	$0.25
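To compare candidates for a specific workload, you can price the same token counts against each model. The sketch below uses the input and output prices from the By Model table earlier on this page (the table directly above lists only one price per model, so those models are left out). The names are hypothetical, and cost is only one factor: the model still has to be capable enough for the task.

```typescript
interface ModelPrice {
  inputPerMillion: number;  // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Prices copied from the By Model table on this page.
const MODEL_PRICES: Record<string, ModelPrice> = {
  "gpt-4o": { inputPerMillion: 2.5, outputPerMillion: 10.0 },
  "gpt-4o-mini": { inputPerMillion: 0.15, outputPerMillion: 0.6 },
  "claude-3-5-sonnet": { inputPerMillion: 3.0, outputPerMillion: 15.0 },
};

// Prices one request shape against every model, cheapest first.
function rankModelsByCost(
  inputTokens: number,
  outputTokens: number,
): Array<{ model: string; cost: number }> {
  return Object.entries(MODEL_PRICES)
    .map(([model, price]) => ({
      model,
      cost:
        (inputTokens * price.inputPerMillion + outputTokens * price.outputPerMillion) /
        1_000_000,
    }))
    .sort((a, b) => a.cost - b.cost);
}
```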

3. Optimize Prompts

Reduce token usage with efficient prompts:

// Verbose (150 tokens)
const verbose = `Please analyze the following text and provide a
comprehensive summary that covers all the main points, themes,
and important details. Make sure to include...`;
 
// Concise (20 tokens)
const concise = `Summarize the key points of this text:`;

4. Set Max Tokens

Limit output length:

await openai.chat.completions.create({
  model: 'gpt-4o',
  max_tokens: 500,  // Limit output
  messages: [...],
});
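Capping max_tokens also puts an upper bound on the output cost of a single request. As a rough sketch using the GPT-4o output price from the example earlier on this page (the helper name is hypothetical):

```typescript
// Worst-case output cost for one request: every allowed output token is generated.
function maxOutputCost(maxTokens: number, outputPricePerMillion: number): number {
  return (maxTokens * outputPricePerMillion) / 1_000_000;
}

// 500 tokens at $10.00 per 1M output tokens.
console.log(maxOutputCost(500, 10.0).toFixed(3)); // "0.005"
```

Input cost is not affected by this limit, so the total for a request can still grow with prompt length.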

5. Monitor Expensive Requests

Identify and optimize high-cost requests:

1. Go to Analytics > Requests
2. Sort by Cost (descending)
3. Review the most expensive requests
4. Optimize their prompts or model selection

Export Cost Data

Export cost data for accounting:

CSV Export

1. Go to Analytics
2. Click Export > CSV
3. Select a date range
4. Download the file

API Export

curl https://api.transactional.dev/ai-gateway/costs \
  -H "Authorization: Bearer $GATEWAY_API_KEY" \
  -G --data-urlencode "from=2024-01-01" \
        --data-urlencode "to=2024-01-31" \
        --data-urlencode "format=csv"
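The same request can be made from TypeScript. This sketch mirrors the curl command above and assumes nothing beyond it: the endpoint, query parameters, and bearer token come from that command, while the function names are hypothetical. It relies on the global fetch and URL available in Node.js 18+ and in browsers.

```typescript
// Builds the export URL with the same query parameters as the curl command.
function buildCostExportUrl(from: string, to: string, format = "csv"): string {
  const url = new URL("https://api.transactional.dev/ai-gateway/costs");
  url.searchParams.set("from", from);
  url.searchParams.set("to", to);
  url.searchParams.set("format", format);
  return url.toString();
}

// Downloads the export and returns the response body as text.
async function exportCosts(apiKey: string, from: string, to: string): Promise<string> {
  const response = await fetch(buildCostExportUrl(from, to), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Cost export failed with HTTP ${response.status}`);
  }
  return response.text();
}
```

For the January range in the curl example, the call would be exportCosts(apiKey, "2024-01-01", "2024-01-31").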

Cost Reports

Weekly Summary

Receive weekly cost summaries via email:

1. Go to Settings > Notifications
2. Enable Weekly Cost Summary
3. Select recipients

Custom Reports

Create custom reports:

1. Go to Analytics > Reports
2. Click New Report
3. Configure:
   - Metrics (cost, tokens, requests)
   - Grouping (model, user, day)
   - Filters (date range, models)
4. Set a schedule (daily, weekly, or monthly)

Cost by Provider

When fallback is enabled, the cost of a request depends on which provider ends up serving it:

Request Path	Cost
OpenAI (primary)	$0.0025
Anthropic (fallback)	$0.0030
Google (fallback 2)	$0.0015

Track fallback costs separately in analytics.

Billing

How Billing Works

AI Gateway tracks costs in real time
No markup on provider costs
Pay only for actual usage
Monthly invoice with detailed breakdown

Invoice Details

Your invoice includes:

Total requests
Token usage by model
Cost by provider
Cache savings

Next Steps

Caching - Reduce costs with caching
Rate Limiting - Control usage
Analytics Dashboard - View your costs
