# Overview
Unified API gateway for OpenAI, Anthropic, and other LLM providers with caching, fallback, and observability built-in.
## What is AI Gateway?

AI Gateway is a unified API proxy that gives you a single endpoint for all your LLM needs. Use the familiar OpenAI SDK format to access OpenAI, Anthropic, Google, and more, with intelligent caching, automatic failover, and built-in observability.
## Key Benefits
- Unified API - One endpoint, one format, multiple providers
- Response Caching - Reduce costs by up to 90% with intelligent caching
- Automatic Failover - Seamless switching between providers when errors occur
- Cost Tracking - Real-time visibility into LLM spending per model and user
- Full Observability - Every request is logged for debugging and analytics
## Why AI Gateway?
Building production AI applications requires more than just calling APIs:
| Challenge | AI Gateway Solution |
|---|---|
| Provider lock-in | Switch providers without code changes |
| High costs | Cache identical requests automatically |
| Downtime | Automatic failover to backup providers |
| No visibility | Full request logging and analytics |
| Complex SDKs | Use OpenAI SDK format for all providers |
## How It Works
```
Your App → AI Gateway → OpenAI / Anthropic / Google
                 ↓
           Caching Layer
                 ↓
           Observability
```
1. Your app sends requests using the OpenAI SDK format
2. AI Gateway routes to the configured provider
3. Responses are cached based on your settings
4. Every request is logged for analytics
## Supported Providers
| Provider | Models | Status |
|---|---|---|
| OpenAI | GPT-4o, GPT-4-turbo, GPT-3.5, o1, o1-mini | Fully Supported |
| Anthropic | Claude Opus 4, Claude Sonnet 4, Claude 3.5 Sonnet, Claude 3 Haiku | Fully Supported |
| Google | Gemini 2.0, Gemini 1.5 Pro, Gemini 1.5 Flash | Coming Soon |
| AWS Bedrock | Claude, Llama, Titan | Coming Soon |
## Quick Example
Use the OpenAI SDK - just change the base URL and API key:
```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://api.transactional.dev/ai/v1',
  apiKey: 'gw_sk_your_gateway_key',
});

// Works with any provider you've configured!
const response = await openai.chat.completions.create({
  model: 'gpt-4o', // or 'claude-3-5-sonnet', 'gemini-pro', etc.
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

console.log(response.choices[0].message.content);
```

## Key Features
### Response Caching
Identical requests are automatically cached, reducing costs and latency:
- Configurable TTL (time-to-live)
- Per-model cache settings
- Cache hit/miss monitoring
- Opt-out for non-deterministic requests
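Conceptually, caching works by treating an identical request body as an identical cache key. The sketch below is a hypothetical illustration of that idea (the gateway's real key scheme and cache are internal and server-side); `cacheKey`, `TtlCache`, and the 60-second TTL are all assumptions for the example:

```typescript
import { createHash } from 'crypto';

// Hypothetical request shape: just the fields that make two requests "identical".
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
}

// Identical model + messages produce the same hash, so a repeat request hits the cache.
function cacheKey(req: ChatRequest): string {
  return createHash('sha256').update(JSON.stringify(req)).digest('hex');
}

// Minimal TTL cache: entries expire after ttlMs milliseconds.
class TtlCache {
  private store = new Map<string, { value: string; expires: number }>();
  constructor(private ttlMs: number) {}

  get(key: string): string | undefined {
    const entry = this.store.get(key);
    if (!entry || entry.expires < Date.now()) return undefined; // miss or expired
    return entry.value;
  }

  set(key: string, value: string): void {
    this.store.set(key, { value, expires: Date.now() + this.ttlMs });
  }
}

const cache = new TtlCache(60_000); // assumed 60s TTL for illustration
const req: ChatRequest = { model: 'gpt-4o', messages: [{ role: 'user', content: 'Hello!' }] };
cache.set(cacheKey(req), 'Hi there!');
console.log(cache.get(cacheKey(req))); // identical request → cached response
```

This is also why opting out matters for non-deterministic requests: a cached response would otherwise be replayed for every identical prompt until the TTL expires.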
### Provider Fallback
Configure backup providers for automatic failover:
```
Primary: OpenAI (gpt-4o)
    ↓ (if error)
Fallback 1: Anthropic (claude-3-5-sonnet)
    ↓ (if error)
Fallback 2: Google (gemini-pro)
```
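The chain above follows a simple pattern: try each model in order, moving on only when a call errors. AI Gateway applies this server-side, so your client needs no retry code; the `withFallback` helper and the stubbed provider call below are purely illustrative:

```typescript
// Hypothetical sketch of the failover pattern: first successful provider wins.
type Completion = (model: string) => Promise<string>;

async function withFallback(models: string[], call: Completion): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await call(model); // success: stop here
    } catch (err) {
      lastError = err; // error: fall through to the next model
    }
  }
  throw lastError; // every provider failed
}

// Stubbed provider call: the primary "fails", so the first fallback answers.
const result = await withFallback(
  ['gpt-4o', 'claude-3-5-sonnet', 'gemini-pro'],
  async (model) => {
    if (model === 'gpt-4o') throw new Error('provider down');
    return `answer from ${model}`;
  },
);
console.log(result); // answer from claude-3-5-sonnet
```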
### Cost Management
- Real-time cost tracking per request
- Per-user and per-model breakdowns
- Budget alerts when spending exceeds thresholds
- Historical cost analytics
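Per-request cost tracking generally comes down to multiplying token usage by a per-model price table. The sketch below shows that arithmetic only; the prices are made-up placeholders, not actual provider or gateway rates:

```typescript
// Illustrative price table in $/1M tokens. These numbers are placeholders.
const pricePerMillion: Record<string, { input: number; output: number }> = {
  'gpt-4o': { input: 2.5, output: 10 },
  'claude-3-5-sonnet': { input: 3, output: 15 },
};

// Cost of one request from its token counts (the usage object most
// chat-completions responses include reports these).
function requestCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricePerMillion[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

console.log(requestCostUsd('gpt-4o', 1_000, 500)); // 0.0075
```

Summing these per-request figures by user or by model is what produces the breakdowns and budget alerts listed above.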
### Streaming Support
Full SSE streaming support for real-time responses:
```typescript
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```

## Comparison with Direct Provider Calls
| Feature | Direct API | AI Gateway |
|---|---|---|
| Multiple providers | Multiple SDKs | Single SDK |
| Response caching | Build yourself | Built-in |
| Failover | Build yourself | Automatic |
| Cost tracking | Per-provider dashboards | Unified dashboard |
| Request logging | Build yourself | Automatic |
| Rate limiting | Per-provider | Unified |
## Getting Started
1. Add Provider Keys - Configure your OpenAI, Anthropic, or other API keys
2. Create Gateway Key - Generate a gateway API key (`gw_sk_*`)
3. Update Your Code - Point the OpenAI SDK to AI Gateway
4. Monitor - View requests and costs in the dashboard
Ready to get started? Check out our Quickstart Guide.
## Related Features
AI Gateway integrates seamlessly with our Observability platform:
- All gateway requests are automatically traced
- View requests alongside your application traces
- Correlate LLM calls with user sessions
- Debug issues with full request/response logging
Learn more about Observability.
## Next Steps
- Quickstart Guide - Get up and running in 5 minutes
- Providers - Supported providers and models
- Caching - Configure response caching
- API Reference - Full endpoint documentation