Tracing AI Agents
Best practices for observing multi-step AI agent workflows.
Overview
AI agents are autonomous systems that use LLMs to plan, execute tools, and iterate toward goals. Proper tracing is essential for debugging, optimization, and understanding agent behavior.
Agent Architecture
User Request
↓
[Agent Loop]
├── Think: Plan next action
├── Act: Execute tool/action
├── Observe: Process result
└── Repeat until done
↓
Final Response
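The loop above can be sketched in plain TypeScript before any tracing is added. Everything here (the `think` and `act` callbacks, the `Step` type) is an illustrative placeholder, not part of any library:

```typescript
// Minimal ReAct-style loop: think, act, observe, repeat until done.
// All names and types here are illustrative placeholders.
type Step = { action: string; observation: string };

async function agentLoop(
  goal: string,
  think: (goal: string, history: Step[]) => Promise<{ action: string; done: boolean }>,
  act: (action: string) => Promise<string>,
  maxIterations = 10,
): Promise<Step[]> {
  const history: Step[] = [];
  for (let i = 0; i < maxIterations; i++) {
    const decision = await think(goal, history); // Think: plan next action
    if (decision.done) break;                    // Goal reached
    const observation = await act(decision.action); // Act + Observe
    history.push({ action: decision.action, observation });
  }
  return history;
}
```

Each hook in this sketch corresponds to an observation in the traced version below: `think` becomes a generation, `act` becomes a tool span, and each pass through the loop becomes an iteration span.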
Tracing Agent Loops
Basic Agent Trace
import { getObservability } from '@transactional/observability';

async function runAgent(goal: string): Promise<string> {
  const obs = getObservability();

  // Create main trace for entire agent run
  const trace = obs.trace({
    name: 'agent-run',
    input: { goal },
    metadata: {
      agentType: 'react',
      maxIterations: 10,
    },
  });

  const history: Array<{ action: string; result: unknown }> = [];
  let iteration = 0;
  const maxIterations = 10;

  try {
    while (iteration < maxIterations) {
      iteration++;

      // Span for this iteration
      const iterationSpan = obs.observation({
        type: 'SPAN',
        name: `iteration-${iteration}`,
        input: { iteration },
      });

      // Think: LLM decides next action
      const prompt = buildPrompt(goal, history);
      const thinkGeneration = obs.generation({
        name: 'think',
        modelName: 'gpt-4o',
        parentObservationId: iterationSpan.id,
        input: { prompt },
      });

      const decision = await llm.decide(prompt);

      await thinkGeneration.end({
        output: decision,
        promptTokens: decision.usage.prompt_tokens,
        completionTokens: decision.usage.completion_tokens,
      });

      // Check if done
      if (decision.action === 'finish') {
        await iterationSpan.end({
          output: { status: 'complete', result: decision.result },
        });
        break;
      }

      // Act: Execute tool
      const actionSpan = obs.observation({
        type: 'SPAN',
        name: `action-${decision.action}`,
        parentObservationId: iterationSpan.id,
        input: { tool: decision.action, args: decision.args },
      });

      const result = await executeTool(decision.action, decision.args);

      await actionSpan.end({
        output: { result },
      });

      // Observe: record the step so the next prompt can use it
      history.push({ action: decision.action, result });

      await iterationSpan.end({
        output: {
          action: decision.action,
          result: result,
        },
      });
    }

    const finalResult = synthesizeResult(history);

    await trace.end({
      output: {
        result: finalResult,
        iterations: iteration,
        toolsUsed: getToolsUsed(history),
      },
    });

    return finalResult;
  } catch (error) {
    await trace.error(error as Error);
    throw error;
  }
}

Tool Execution Tracking
Individual Tool Traces
async function executeTool(name: string, args: any): Promise<any> {
  const obs = getObservability();

  const span = obs.observation({
    type: 'SPAN',
    name: `tool-${name}`,
    input: { tool: name, args },
    metadata: {
      toolCategory: getToolCategory(name),
    },
  });

  try {
    const startTime = Date.now();
    const result = await tools[name](args);
    const duration = Date.now() - startTime;

    await span.end({
      output: {
        success: true,
        result,
        duration,
      },
    });

    return result;
  } catch (error) {
    await span.end({
      output: {
        success: false,
        error: (error as Error).message,
      },
    });
    throw error;
  }
}

Tool Categories
Track tools by category:
const toolCategories = {
  search: ['web_search', 'file_search', 'code_search'],
  action: ['write_file', 'run_code', 'send_email'],
  retrieval: ['read_file', 'fetch_url', 'query_db'],
};

const span = obs.observation({
  type: 'SPAN',
  name: `tool-${toolName}`,
  metadata: {
    category: getCategory(toolName),
    isDestructive: isDestructive(toolName),
    requiresApproval: requiresApproval(toolName),
  },
});

Reasoning Visualization
Thought Process Tracking
Capture agent reasoning:
const thinkGeneration = obs.generation({
  name: 'think',
  modelName: 'gpt-4o',
  input: {
    systemPrompt,
    userGoal: goal,
    history: formatHistory(history),
    availableTools: tools.map(t => t.name),
  },
  metadata: {
    thoughtType: 'planning', // or 'reflection', 'correction'
  },
});

Decision Logging
Track decision points:
await thinkGeneration.end({
  output: {
    thought: decision.thought,
    action: decision.action,
    actionInput: decision.actionInput,
    confidence: decision.confidence,
    alternatives: decision.alternatives, // Other considered actions
  },
});

Multi-Agent Systems
Agent Coordination
Trace multi-agent interactions:
async function runMultiAgent(task: string) {
  const obs = getObservability();

  const trace = obs.trace({
    name: 'multi-agent-task',
    input: { task },
    metadata: {
      agents: ['planner', 'executor', 'reviewer'],
    },
  });

  // Planner agent
  const plannerSpan = obs.observation({
    type: 'SPAN',
    name: 'agent-planner',
    input: { task },
  });

  const plan = await plannerAgent.plan(task);

  await plannerSpan.end({
    output: { plan },
  });

  // Executor agent
  const executorSpan = obs.observation({
    type: 'SPAN',
    name: 'agent-executor',
    input: { plan },
  });

  const result = await executorAgent.execute(plan);

  await executorSpan.end({
    output: { result },
  });

  // Reviewer agent
  const reviewerSpan = obs.observation({
    type: 'SPAN',
    name: 'agent-reviewer',
    input: { task, result },
  });

  const feedback = await reviewerAgent.review(task, result);

  await reviewerSpan.end({
    output: { feedback, approved: feedback.approved },
  });

  await trace.end({
    output: {
      result,
      feedback,
      approved: feedback.approved,
    },
  });
}

Error Recovery
Tracking Retries
Monitor error recovery:
async function executeWithRetry(action: string, args: any, maxRetries = 3) {
  const obs = getObservability();

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const span = obs.observation({
      type: 'SPAN',
      name: `action-attempt-${attempt}`,
      input: { action, args, attempt },
    });

    try {
      const result = await executeTool(action, args);
      await span.end({ output: { success: true, result } });
      return result;
    } catch (error) {
      await span.end({
        output: {
          success: false,
          error: (error as Error).message,
          willRetry: attempt < maxRetries,
        },
      });
      if (attempt === maxRetries) throw error;
    }
  }
}

Self-Correction
Track when agent corrects itself:
if (toolResult.error) {
  const correctionGeneration = obs.generation({
    name: 'self-correction',
    modelName: 'gpt-4o',
    input: {
      previousAction: lastAction,
      error: toolResult.error,
      goal,
    },
    metadata: {
      correctionType: 'error-recovery',
    },
  });

  const correction = await llm.analyzeError(toolResult.error);

  await correctionGeneration.end({
    output: {
      analysis: correction.analysis,
      newStrategy: correction.newStrategy,
    },
  });
}

Performance Monitoring
Agent Efficiency Metrics
Track agent performance:
await trace.end({
  output: { result },
  metadata: {
    iterations: iterationCount,
    toolCalls: toolCallCount,
    totalTokens: totalTokens,
    totalCost: totalCost,
    efficiency: result ? 'success' : 'failure',
    timeToFirstAction: timeToFirstAction,
    totalDuration: totalDuration,
  },
});

Alerts
Set up alerts for:
- Max iterations exceeded
- Stuck loops (same action repeated)
- High token usage
- Frequent failures
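These checks can run over the per-run counters collected in the efficiency metadata above. A minimal sketch, assuming the stats shape and thresholds shown here (`AgentRunStats`, `isStuckLoop`, and the threshold values are all illustrative, not part of the observability API):

```typescript
// Illustrative alert checks over per-run agent stats.
// The field names and thresholds are examples, not library API.
interface AgentRunStats {
  iterations: number;
  maxIterations: number;
  totalTokens: number;
  tokenBudget: number;
  toolCalls: number;
  failedToolCalls: number;
  actions: string[]; // action name per iteration, in order
}

// Stuck loop: the same action repeated for the last `window` iterations.
function isStuckLoop(actions: string[], window = 3): boolean {
  if (actions.length < window) return false;
  const recent = actions.slice(-window);
  return recent.every((a) => a === recent[0]);
}

function checkAlerts(stats: AgentRunStats): string[] {
  const alerts: string[] = [];
  if (stats.iterations >= stats.maxIterations) alerts.push('max_iterations_exceeded');
  if (isStuckLoop(stats.actions)) alerts.push('stuck_loop');
  if (stats.totalTokens > stats.tokenBudget) alerts.push('high_token_usage');
  if (stats.toolCalls > 0 && stats.failedToolCalls / stats.toolCalls > 0.5) {
    alerts.push('frequent_failures');
  }
  return alerts;
}
```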
Best Practices
1. Hierarchical Traces
Trace: agent-run
├── Span: iteration-1
│ ├── Generation: think
│ └── Span: tool-search
├── Span: iteration-2
│ ├── Generation: think
│ └── Span: tool-write
└── ...
2. Rich Metadata
trace({
  name: 'agent-run',
  metadata: {
    agentVersion: '2.0',
    modelConfig: { temperature: 0.7 },
    toolsAvailable: toolNames,
    maxBudget: { iterations: 10, tokens: 50000 },
  },
});

3. Track All Decisions
Every LLM call should be a generation:
- Planning decisions
- Tool selection
- Error analysis
- Final synthesis
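One way to enforce this is to route every LLM call through a single wrapper. A hedged sketch, where `GenerationRecord` is a stand-in for the real observability client rather than its actual API:

```typescript
// Sketch: route every LLM call through one wrapper so it is always
// recorded as a generation. `GenerationRecord` is a stand-in type,
// not the actual observability client API.
interface GenerationRecord {
  name: string; // e.g. 'think', 'tool-selection', 'error-analysis', 'synthesis'
  input: unknown;
  output?: unknown;
  error?: string;
}

async function tracedGeneration<T>(
  log: GenerationRecord[],
  name: string,
  input: unknown,
  call: () => Promise<T>,
): Promise<T> {
  const record: GenerationRecord = { name, input };
  log.push(record);
  try {
    const output = await call();
    record.output = output;
    return output;
  } catch (err) {
    record.error = (err as Error).message; // Failed calls are recorded too
    throw err;
  }
}
```

With this pattern no LLM call can slip through untraced, because the only way to call the model is through the wrapper.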
4. Monitor Resource Usage
const resourceMonitor = {
  tokensUsed: 0,
  toolCalls: 0,
  iterations: 0,
  // Use a method, not an arrow function, so `this` refers to the monitor
  checkBudget() {
    if (this.tokensUsed > 50000) return 'token_limit';
    if (this.iterations > 10) return 'iteration_limit';
    return 'ok';
  },
};

Next Steps
- RAG Guide - Tracing RAG pipelines
- Production Guide - Production best practices
- Spans - Custom spans