# Overview
Unified API gateway for OpenAI, Anthropic, and other LLM providers with caching, fallback, and observability built-in.
## What is AI Gateway?

AI Gateway is a unified API proxy that gives you a single endpoint for all your LLM needs. Use the familiar OpenAI SDK format to access OpenAI, Anthropic, Google, and more, with intelligent caching, automatic failover, and built-in observability.
## Key Benefits
- Unified API - One endpoint, one format, multiple providers
- Response Caching - Reduce costs by up to 90% with intelligent caching
- Automatic Failover - Seamless switching between providers when errors occur
- Cost Tracking - Real-time visibility into LLM spending per model and user
- Full Observability - Every request is logged for debugging and analytics
## Why AI Gateway?
Building production AI applications requires more than just calling APIs:
| Challenge | AI Gateway Solution |
|---|---|
| Provider lock-in | Switch providers without code changes |
| High costs | Cache identical requests automatically |
| Downtime | Automatic failover to backup providers |
| No visibility | Full request logging and analytics |
| Complex SDKs | Use OpenAI SDK format for all providers |
## How It Works
```
Your App → AI Gateway → OpenAI / Anthropic / Google
                 ↓
           Caching Layer
                 ↓
           Observability
```
1. Your app sends requests using the OpenAI SDK format
2. AI Gateway routes to the configured provider
3. Responses are cached based on your settings
4. Every request is logged for analytics
## Supported Providers
| Provider | Models | Status |
|---|---|---|
| OpenAI | GPT-4o, GPT-4-turbo, GPT-3.5, o1, o1-mini | Fully Supported |
| Anthropic | Claude Opus 4, Claude Sonnet 4, Claude 3.5 Sonnet, Claude 3 Haiku | Fully Supported |
| Google | Gemini 2.0, Gemini 1.5 Pro, Gemini 1.5 Flash | Coming Soon |
| AWS Bedrock | Claude, Llama, Titan | Coming Soon |
## Quick Example
Use the OpenAI SDK - just change the base URL and API key:
```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://api.transactional.dev/ai/v1',
  apiKey: 'gw_sk_your_gateway_key',
});

// Works with any provider you've configured!
const response = await openai.chat.completions.create({
  model: 'gpt-4o', // or 'claude-3-5-sonnet', 'gemini-pro', etc.
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

console.log(response.choices[0].message.content);
```

## Key Features
### Response Caching
Identical requests are automatically cached, reducing costs and latency:
- Configurable TTL (time-to-live)
- Per-model cache settings
- Cache hit/miss monitoring
- Opt-out for non-deterministic requests
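Conceptually, caching works by treating an identical request body as an identical cache key. The sketch below is a hypothetical illustration of that idea (the gateway's real key scheme and cache are internal and server-side); `cacheKey`, `TtlCache`, and the 60-second TTL are all assumptions for the example:

```typescript
import { createHash } from 'crypto';

// Hypothetical request shape: just the fields that make two requests "identical".
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
}

// Identical model + messages produce the same hash, so a repeat request hits the cache.
function cacheKey(req: ChatRequest): string {
  return createHash('sha256').update(JSON.stringify(req)).digest('hex');
}

// Minimal TTL cache: entries expire after ttlMs milliseconds.
class TtlCache {
  private store = new Map<string, { value: string; expires: number }>();
  constructor(private ttlMs: number) {}

  get(key: string): string | undefined {
    const entry = this.store.get(key);
    if (!entry || entry.expires < Date.now()) return undefined; // miss or expired
    return entry.value;
  }

  set(key: string, value: string): void {
    this.store.set(key, { value, expires: Date.now() + this.ttlMs });
  }
}

const cache = new TtlCache(60_000); // assumed 60s TTL for illustration
const req: ChatRequest = { model: 'gpt-4o', messages: [{ role: 'user', content: 'Hello!' }] };
cache.set(cacheKey(req), 'Hi there!');
console.log(cache.get(cacheKey(req))); // identical request → cached response
```

This is also why opting out matters for non-deterministic requests: a cached response would otherwise be replayed for every identical prompt until the TTL expires.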
### Provider Fallback
Configure backup providers for automatic failover:
```
Primary: OpenAI (gpt-4o)
    ↓ (if error)
Fallback 1: Anthropic (claude-3-5-sonnet)
    ↓ (if error)
Fallback 2: Google (gemini-pro)
```
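The chain above follows a simple pattern: try each model in order, moving on only when a call errors. AI Gateway applies this server-side, so your client needs no retry code; the `withFallback` helper and the stubbed provider call below are purely illustrative:

```typescript
// Hypothetical sketch of the failover pattern: first successful provider wins.
type Completion = (model: string) => Promise<string>;

async function withFallback(models: string[], call: Completion): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await call(model); // success: stop here
    } catch (err) {
      lastError = err; // error: fall through to the next model
    }
  }
  throw lastError; // every provider failed
}

// Stubbed provider call: the primary "fails", so the first fallback answers.
const result = await withFallback(
  ['gpt-4o', 'claude-3-5-sonnet', 'gemini-pro'],
  async (model) => {
    if (model === 'gpt-4o') throw new Error('provider down');
    return `answer from ${model}`;
  },
);
console.log(result); // answer from claude-3-5-sonnet
```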
### Cost Management
- Real-time cost tracking per request
- Per-user and per-model breakdowns
- Budget alerts when spending exceeds thresholds
- Historical cost analytics
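Per-request cost tracking generally comes down to multiplying token usage by a per-model price table. The sketch below shows that arithmetic only; the prices are made-up placeholders, not actual provider or gateway rates:

```typescript
// Illustrative price table in $/1M tokens. These numbers are placeholders.
const pricePerMillion: Record<string, { input: number; output: number }> = {
  'gpt-4o': { input: 2.5, output: 10 },
  'claude-3-5-sonnet': { input: 3, output: 15 },
};

// Cost of one request from its token counts (the usage object most
// chat-completions responses include reports these).
function requestCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricePerMillion[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

console.log(requestCostUsd('gpt-4o', 1_000, 500)); // 0.0075
```

Summing these per-request figures by user or by model is what produces the breakdowns and budget alerts listed above.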
### Streaming Support
Full SSE streaming support for real-time responses:
```typescript
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```

## Comparison with Direct Provider Calls
| Feature | Direct API | AI Gateway |
|---|---|---|
| Multiple providers | Multiple SDKs | Single SDK |
| Response caching | Build yourself | Built-in |
| Failover | Build yourself | Automatic |
| Cost tracking | Per-provider dashboards | Unified dashboard |
| Request logging | Build yourself | Automatic |
| Rate limiting | Per-provider | Unified |
## Getting Started
1. Add Provider Keys - Configure your OpenAI, Anthropic, or other API keys
2. Create Gateway Key - Generate a gateway API key (`gw_sk_*`)
3. Update Your Code - Point the OpenAI SDK to AI Gateway
4. Monitor - View requests and costs in the dashboard
Ready to get started? Check out our Quickstart Guide.
## Related Features
AI Gateway integrates seamlessly with our Observability platform:
- All gateway requests are automatically traced
- View requests alongside your application traces
- Correlate LLM calls with user sessions
- Debug issues with full request/response logging
Learn more about Observability.
## Next Steps
- Quickstart Guide - Get up and running in 5 minutes
- Providers - Supported providers and models
- Caching - Configure response caching
- API Reference - Full endpoint documentation