
Overview

Unified API gateway for OpenAI, Anthropic, and other LLM providers with caching, fallback, and observability built-in.

What is AI Gateway?

AI Gateway is a unified API proxy that gives you a single endpoint for all your LLM needs. Use the familiar OpenAI SDK format to access OpenAI, Anthropic, Google, and more, with intelligent caching, automatic failover, and built-in observability.

Key Benefits

  • Unified API - One endpoint, one format, multiple providers
  • Response Caching - Reduce costs by up to 90% with intelligent caching
  • Automatic Failover - Seamless switching between providers when errors occur
  • Cost Tracking - Real-time visibility into LLM spending per model and user
  • Full Observability - Every request is logged for debugging and analytics

Why AI Gateway?

Building production AI applications requires more than just calling APIs:

Challenge         | AI Gateway Solution
------------------|-----------------------------------------
Provider lock-in  | Switch providers without code changes
High costs        | Cache identical requests automatically
Downtime          | Automatic failover to backup providers
No visibility     | Full request logging and analytics
Complex SDKs      | Use OpenAI SDK format for all providers

How It Works

Your App → AI Gateway → OpenAI / Anthropic / Google
              ↓
          Caching Layer
              ↓
          Observability
  1. Your app sends requests using the OpenAI SDK format
  2. AI Gateway routes to the configured provider
  3. Responses are cached based on your settings
  4. Every request is logged for analytics
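The four steps above can be sketched in TypeScript. This is purely illustrative: the provider call and logger below are stubs, and the gateway's real internals are not published here.

```typescript
// Illustrative sketch of the gateway request lifecycle (stubs, not real internals).
type ChatRequest = { model: string; messages: { role: string; content: string }[] };

const cache = new Map<string, string>();

// 1. accept an OpenAI-format request, 2. route it to a provider,
// 3. cache the response, 4. log every request.
async function handleRequest(
  req: ChatRequest,
  callProvider: (req: ChatRequest) => Promise<string>,
  log: (entry: { model: string; cached: boolean }) => void,
): Promise<string> {
  const key = JSON.stringify(req);           // identical requests share a key
  const hit = cache.get(key);
  if (hit !== undefined) {
    log({ model: req.model, cached: true }); // 4. logged even on a cache hit
    return hit;                              // 3. served from cache
  }
  const response = await callProvider(req);  // 2. route to the configured provider
  cache.set(key, response);
  log({ model: req.model, cached: false });
  return response;
}
```

With this shape, a second identical request never reaches the provider: it is answered from the cache and recorded as a hit.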

Supported Providers

Provider    | Models                                                            | Status
------------|-------------------------------------------------------------------|----------------
OpenAI      | GPT-4o, GPT-4-turbo, GPT-3.5, o1, o1-mini                         | Fully Supported
Anthropic   | Claude Opus 4, Claude Sonnet 4, Claude 3.5 Sonnet, Claude 3 Haiku | Fully Supported
Google      | Gemini 2.0, Gemini 1.5 Pro, Gemini 1.5 Flash                      | Coming Soon
AWS Bedrock | Claude, Llama, Titan                                              | Coming Soon

Quick Example

Use the OpenAI SDK - just change the base URL and API key:

import OpenAI from 'openai';
 
const openai = new OpenAI({
  baseURL: 'https://api.transactional.dev/ai/v1',
  apiKey: 'gw_sk_your_gateway_key',
});
 
// Works with any provider you've configured!
const response = await openai.chat.completions.create({
  model: 'gpt-4o', // or 'claude-3-5-sonnet', 'gemini-pro', etc.
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});
 
console.log(response.choices[0].message.content);

Key Features

Response Caching

Identical requests are automatically cached, reducing costs and latency:

  • Configurable TTL (time-to-live)
  • Per-model cache settings
  • Cache hit/miss monitoring
  • Opt-out for non-deterministic requests
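A minimal sketch of how a TTL-based response cache can behave. This is our assumption about the mechanism, not the gateway's actual implementation; the key is a hash of the request body, and the clock is injectable so expiry is testable.

```typescript
import { createHash } from 'node:crypto';

// Minimal TTL cache sketch: entries expire ttlMs after being set.
class TtlCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  // Identical request bodies hash to the same key.
  private key(requestBody: unknown): string {
    return createHash('sha256').update(JSON.stringify(requestBody)).digest('hex');
  }

  get(requestBody: unknown): string | undefined {
    const entry = this.store.get(this.key(requestBody));
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) return undefined; // expired: treat as a miss
    return entry.value;
  }

  set(requestBody: unknown, value: string): void {
    this.store.set(this.key(requestBody), {
      value,
      expiresAt: this.now() + this.ttlMs,
    });
  }
}
```

Opting out for non-deterministic requests then amounts to skipping the `get`/`set` calls for those requests entirely.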

Provider Fallback

Configure backup providers for automatic failover:

Primary: OpenAI (gpt-4o)
    ↓ (if error)
Fallback 1: Anthropic (claude-3-5-sonnet)
    ↓ (if error)
Fallback 2: Google (gemini-pro)
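Conceptually, the failover chain above is a list of providers tried in order. Here is a generic sketch of that logic; the `Provider` shape and names are illustrative, not the gateway's configuration format.

```typescript
// Illustrative provider interface: a name plus an async call.
type Provider = { name: string; call: (prompt: string) => Promise<string> };

// Try each provider in order; return the first successful response.
// Throws only if every provider in the chain fails.
async function withFallback(providers: Provider[], prompt: string): Promise<string> {
  let lastError: unknown;
  for (const p of providers) {
    try {
      return await p.call(prompt);
    } catch (err) {
      lastError = err; // on error, fall through to the next provider
    }
  }
  throw new Error(`All providers failed: ${String(lastError)}`);
}
```

In the chain shown above, an OpenAI error would transparently retry against Anthropic, then Google, before surfacing a failure to your app.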

Cost Management

  • Real-time cost tracking per request
  • Per-user and per-model breakdowns
  • Budget alerts when spending exceeds thresholds
  • Historical cost analytics
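The per-model breakdowns and budget alerts above boil down to simple aggregation over logged usage records. The record shape and threshold logic below are a sketch of that idea, not the dashboard's actual data model.

```typescript
// Hypothetical usage record: one entry per logged request.
type UsageRecord = { model: string; costUsd: number };

// Per-model cost breakdown: sum costs grouped by model.
function costByModel(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.model, (totals.get(r.model) ?? 0) + r.costUsd);
  }
  return totals;
}

// Budget alert: true when total spend exceeds the configured threshold.
function overBudget(totals: Map<string, number>, budgetUsd: number): boolean {
  let total = 0;
  for (const v of totals.values()) total += v;
  return total > budgetUsd;
}
```

The same grouping applied to a user ID instead of the model name yields the per-user breakdown.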

Streaming Support

Full SSE streaming support for real-time responses:

const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true,
});
 
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Comparison with Direct Provider Calls

Feature            | Direct API              | AI Gateway
-------------------|-------------------------|-------------------
Multiple providers | Multiple SDKs           | Single SDK
Response caching   | Build yourself          | Built-in
Failover           | Build yourself          | Automatic
Cost tracking      | Per-provider dashboards | Unified dashboard
Request logging    | Build yourself          | Automatic
Rate limiting      | Per-provider            | Unified

Getting Started

  1. Add Provider Keys - Configure your OpenAI, Anthropic, or other API keys
  2. Create Gateway Key - Generate a gateway API key (gw_sk_*)
  3. Update Your Code - Point the OpenAI SDK to AI Gateway
  4. Monitor - View requests and costs in the dashboard

Ready to get started? Check out our Quickstart Guide.

Observability Integration

AI Gateway integrates seamlessly with our Observability platform:

  • All gateway requests are automatically traced
  • View requests alongside your application traces
  • Correlate LLM calls with user sessions
  • Debug issues with full request/response logging

Learn more about Observability.

Next Steps