We Got Tired of Errors Reaching Users First. So We Built Error Tracking.
Error tracking built for AI-native applications. Groups errors by semantic similarity, integrates with LLM traces, and catches AI-specific failure modes that traditional tools miss.
Transactional Team
Jan 14, 2026
7 min read
The Error That Traditional Monitoring Misses
Consider a common scenario: an AI support bot starts telling customers their invoices are overdue when they are not. No errors in the logs. No exceptions thrown. The model returns valid JSON with a 200 status code. It is just wrong.
In many cases, these issues are discovered because a customer reports them -- not because monitoring catches them, not because an error tracker fires, but because a human notices.
This is exactly the kind of failure that error tracking for AI applications has to catch.
Sentry is good. Bugsnag is good. They catch exceptions, group them by stack trace, and show you where things broke. For traditional applications, that works.
But AI applications have failure modes that never throw exceptions:
Hallucinations: The model confidently states something false
Format violations: The model ignores your output schema
Context window overflow: The prompt is silently truncated
Quality degradation: Responses get worse over time without any code change
Cost spikes: A single request burns $2 worth of tokens due to a prompt bug
Stale cache hits: Cached responses become incorrect as underlying data changes
None of these produce a stack trace. None of them trigger a try/catch. All of them break your application.
We built error tracking that catches all of them.
What Makes It Different
AI-Specific Error Types
Beyond standard exceptions, we detect and categorize AI-specific errors:
// Standard errors (what traditional trackers catch)
- RuntimeError, TypeError, NetworkError
- HTTP 4xx/5xx from providers
- Timeout errors

// AI-specific errors (what we add)
- HallucinationDetected    // Response contradicts provided context
- FormatViolation          // Response does not match expected schema
- QualityBelowThreshold    // Quality score dropped below configured minimum
- CostAnomaly              // Request cost exceeds expected range
- TokenLimitApproached     // Prompt is within 10% of context window
- PromptInjectionAttempt   // Detected manipulation in user input
- GroundingFailure         // Response not supported by retrieved documents
Each error type has its own detection logic. Hallucination detection compares the response against the provided context using semantic similarity. Format violations validate against your defined output schema. Cost anomalies flag requests that cost more than 3x the rolling average.
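To make the cost-anomaly rule concrete, here is a minimal sketch of the check described above. The function and parameter names are illustrative, not the product's internals; the point is simply that each request's cost is compared against a rolling average of recent requests.

// Hypothetical sketch of the cost-anomaly rule: flag any request that costs
// more than `multiplier` times the rolling average of recent requests.
function isCostAnomaly(
  requestCostUsd: number,
  recentCostsUsd: number[],  // costs of the last N requests (the rolling window)
  multiplier = 3.0           // matches the 3x default described above
): boolean {
  if (recentCostsUsd.length === 0) return false; // not enough history yet
  const rollingAverage =
    recentCostsUsd.reduce((sum, cost) => sum + cost, 0) / recentCostsUsd.length;
  return requestCostUsd > multiplier * rollingAverage;
}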
Semantic Grouping
Traditional error trackers group errors by stack trace. Two errors with the same stack trace are the same error. Simple.
But AI errors do not have meaningful stack traces. A hallucination and a correct response follow the exact same code path. The stack trace is identical.
We group errors by semantic similarity instead. If 50 users get a hallucinated response about the same topic, those 50 errors are grouped together even though the exact text is different. The grouping considers:
Error type
Model and provider
Prompt version
Semantic similarity of the input
Semantic similarity of the output
This means your error dashboard shows "Support bot hallucinating about billing dates (47 occurrences)" instead of 47 separate error entries with identical stack traces.
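As a simplified sketch of the grouping idea (the types, threshold, and embedding handling here are illustrative assumptions, not the actual grouping implementation): two errors land in the same group when their structured fields match and their inputs and outputs are semantically close.

// Hypothetical sketch: group errors by type, model, prompt version,
// and semantic similarity of input and output embeddings.
interface AIError {
  type: string;           // e.g. "HallucinationDetected"
  model: string;          // e.g. "anthropic/claude-sonnet-4-20250514"
  promptVersion: string;  // e.g. "v4.1"
  inputEmbedding: number[];
  outputEmbedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function sameGroup(a: AIError, b: AIError, threshold = 0.9): boolean {
  return (
    a.type === b.type &&
    a.model === b.model &&
    a.promptVersion === b.promptVersion &&
    cosineSimilarity(a.inputEmbedding, b.inputEmbedding) >= threshold &&
    cosineSimilarity(a.outputEmbedding, b.outputEmbedding) >= threshold
  );
}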
LLM Trace Integration
Every error links directly to its LLM trace. Click an error and you see:
The exact prompt that was sent
The full response that was returned
Token counts and cost
Quality scores across all dimensions
The prompt version that was active
Whether the response was served from cache
This is the context that traditional error trackers cannot provide. When you see a hallucination error, you do not just know it happened -- you see exactly what the model was asked, what it said, and why it was wrong.
import Transactional from "@transactional/sdk";

const client = new Transactional({ apiKey: "tx_live_..." });

// Errors are automatically captured and linked to traces
const response = await client.ai.chat({
  model: "anthropic/claude-sonnet-4-20250514",
  messages: [...],
  validation: {
    schema: myOutputSchema,          // Validate response format
    qualityThreshold: 0.85,          // Flag low-quality responses
    costThreshold: 0.05,             // Flag expensive requests
    groundingContext: retrievedDocs  // Check for hallucinations
  }
});
Real-Time Alerting With Context
Alerts include the context you need to act immediately:
What type of error occurred
How many users are affected
Which model and prompt version
A representative example with full trace
Whether it started after a prompt change or provider issue
Alert channels: email, Slack, PagerDuty, webhooks. Configure severity levels per error type. Hallucinations might be critical. Format violations might be warnings. You decide.
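For instance, a severity and routing map along these lines (the configuration shape below is illustrative, not the documented schema) would page on hallucinations while keeping format violations as Slack warnings:

// Illustrative severity/routing configuration -- field names are hypothetical
const alertConfig = {
  channels: {
    slack: { webhookUrl: "https://hooks.slack.com/..." },
    pagerduty: { routingKey: "..." },
    email: { to: ["oncall@example.com"] },
  },
  severities: {
    HallucinationDetected: { level: "critical", notify: ["pagerduty", "slack"] },
    FormatViolation:       { level: "warning",  notify: ["slack"] },
    CostAnomaly:           { level: "warning",  notify: ["slack", "email"] },
  },
};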
Quick Setup
With AI Gateway (Automatic)
If you use our AI Gateway, error tracking is built in. Enable it in the dashboard and configure your detection thresholds:
// In your AI Gateway settings
{
  errorTracking: {
    enabled: true,
    detectHallucinations: true,
    qualityThreshold: 0.85,
    costAnomalyMultiplier: 3.0,
    formatValidation: true,
    alertChannels: ["slack", "email"]
  }
}
With Direct Provider Calls (SDK)
Add our error tracking middleware to any LLM client:
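The wrapper API is not spelled out in this post, so the following is only a rough sketch of the idea: a hypothetical withErrorTracking helper (the import name and options are assumptions for illustration) wraps an existing provider client so every call is observed and validated without changing how you invoke it.

import OpenAI from "openai";
// Hypothetical wrapper for illustration -- not a documented SDK export
import { withErrorTracking } from "@transactional/sdk";

// Wrap an existing LLM client so failures are reported as grouped,
// trace-linked errors while requests still go straight to the provider.
const openai = withErrorTracking(new OpenAI(), {
  apiKey: "tx_live_...",
  detectHallucinations: true,
  qualityThreshold: 0.85,
});

// Calls are unchanged; the wrapper observes the request and response.
const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize my latest invoice." }],
});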
The Dashboard
The error tracking dashboard is built around three views:
Error Feed shows errors in real-time, grouped semantically. Each group shows the error type, occurrence count, affected users, first/last seen, and status (new, acknowledged, resolved). Filter by error type, model, prompt version, severity, or time range.
Error Detail drills into a specific error group. See every occurrence, the full LLM trace for each, quality scores, and a timeline showing when the error started and how it is trending. If the error correlates with a prompt change or provider issue, we highlight that.
Trends shows error rates over time by type. Overlay with deployment events, prompt version changes, and provider incidents. This is where you see patterns: "hallucination rate jumped 3x after deploying prompt v4.1."
What This Changes
Without AI-aware error tracking, the typical process for finding AI bugs is:
Customer complains
Team searches logs for the request
Someone manually reads the LLM response
Someone decides if it was wrong
Team tries to figure out why
Team greps for similar cases
With AI-native error tracking:
Alert fires with full context
Click through to the trace
See exactly what went wrong and why
Fix the prompt or model configuration
Verify the fix in the quality dashboard
The time from "something is wrong" to "understanding the problem" goes from hours to seconds. The time from "problem understood" to "fix deployed" goes from days to minutes.
For AI applications, the error your users see is not an exception. It is a wrong answer delivered with confidence. Traditional error tracking was never built to catch that. This one was.
Explore the Error Tracking feature page to see the dashboard and start catching AI errors before your users do.