Debugging Agents
Systematic approaches to finding and fixing agent failures without going insane.
The Nature of Agent Bugs
Agent bugs differ from traditional software bugs. They are often probabilistic (a failure may show up in, say, 30% of runs), emergent (appearing only in multi-step sequences), and hard to reproduce deterministically.
These properties call for a different debugging methodology.
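Because a failure may appear in only a fraction of runs, a single passing run proves little. The sketch below measures a failure rate over repeated runs instead. It is a minimal illustration, not a prescribed harness: `run_once` and `flaky_agent` are hypothetical stand-ins for whatever actually invokes your agent.

```python
import random
from typing import Callable


def failure_rate(run_once: Callable[[int], bool], trials: int = 100) -> float:
    """Run the agent `trials` times and return the fraction of failed runs.

    `run_once` is a hypothetical callable: it takes a seed and returns
    True when the run succeeded, False when it failed.
    """
    failures = sum(1 for seed in range(trials) if not run_once(seed))
    return failures / trials


def flaky_agent(seed: int) -> bool:
    """Stand-in for a real agent run that fails roughly 30% of the time."""
    return random.Random(seed).random() >= 0.3


if __name__ == "__main__":
    print(f"observed failure rate: {failure_rate(flaky_agent, 200):.0%}")
```

Comparing this rate before and after a change gives a more honest signal than checking whether one rerun happened to pass.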
Classification First
Before debugging, classify the failure:
- Prompt bug — the model misunderstood its instructions
- Tool bug — a tool returned wrong results or failed
- Logic bug — the agent's planning logic is flawed
- Data bug — the input data was malformed or unexpected
- Integration bug — two correctly-working components fail when combined
Each class calls for a different fix. Misclassifying a failure wastes time on the wrong one.
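One way to make classification a habit is to tag every investigated failure with its class and tally the results. The sketch below assumes a hypothetical `FailureReport` record; the field names and the sample reports are illustrative only.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class FailureClass(Enum):
    """The five failure classes listed above."""
    PROMPT = "prompt"
    TOOL = "tool"
    LOGIC = "logic"
    DATA = "data"
    INTEGRATION = "integration"


@dataclass
class FailureReport:
    """Hypothetical record of one investigated failure."""
    run_id: str
    failure_class: FailureClass
    note: str


def tally(reports: Iterable[FailureReport]) -> Counter:
    """Count how many failures fall into each class."""
    return Counter(report.failure_class for report in reports)


# Made-up example reports, for illustration only.
reports = [
    FailureReport("run-1", FailureClass.PROMPT, "ignored the output format"),
    FailureReport("run-2", FailureClass.TOOL, "search tool timed out"),
    FailureReport("run-3", FailureClass.PROMPT, "misread the stop condition"),
]

if __name__ == "__main__":
    for failure_class, count in tally(reports).most_common():
        print(failure_class.value, count)
```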
The Minimal Reproduction
Find the smallest input that reliably triggers the failure. Start with the full failing case and systematically simplify:
- Remove unnecessary context
- Shorten the message history
- Swap complex tools for stubs
When you have a minimal reproduction, the fix is usually obvious.
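Shortening the message history can be partly automated. The sketch below uses a greedy one-message-at-a-time reduction, which is simpler than full delta debugging. `still_fails` is a hypothetical predicate you would supply: it reruns the agent on the candidate history and reports whether the failure still occurs.

```python
from typing import Callable, List, Sequence


def shrink(messages: Sequence[str],
           still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Drop messages one at a time, keeping only those the failure needs."""
    current = list(messages)
    if not still_fails(current):
        raise ValueError("the starting case must reproduce the failure")
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if still_fails(candidate):
            # Message i was not needed; keep the shorter history and
            # test the message that has now moved into position i.
            current = candidate
        else:
            # Message i is required to trigger the failure; move on.
            i += 1
    return current


# Illustrative history and predicate; both are made up for this example.
history = ["hi", "use tool A", "chitchat", "then delete the file", "bye"]


def fails(msgs: List[str]) -> bool:
    return "use tool A" in msgs and "then delete the file" in msgs


if __name__ == "__main__":
    print(shrink(history, fails))
```

For a probabilistic failure, `still_fails` should rerun the candidate several times and compare the failure rate against a threshold. A single run is not enough to decide.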
Prompt Bug Fixes
In practice, many agent bugs turn out to be prompt bugs. Diagnosis:
- Run the prompt in isolation with the exact inputs that caused the failure
- Ask the model to explain its reasoning ("Before answering, explain step by step why you're making this decision")
- The explanation usually reveals the misunderstanding
Fix: add explicit instructions for the failure case, or add a few-shot example.
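The diagnosis steps above can be sketched as a small helper that replays the prompt in isolation and appends the request for an explanation. `call_model` is a hypothetical callable wrapping whatever model client you use; no particular API is assumed, and the `Input:` layout is only one possible way to combine the prompt and the input.

```python
from typing import Callable

EXPLAIN_SUFFIX = (
    "Before answering, explain step by step why you're making this decision"
)


def build_diagnostic_prompt(prompt: str, failing_input: str) -> str:
    """Combine the original prompt, the exact failing input, and the
    request for an explanation into one standalone prompt."""
    return f"{prompt}\n\nInput:\n{failing_input}\n\n{EXPLAIN_SUFFIX}"


def diagnose(call_model: Callable[[str], str],
             prompt: str,
             failing_input: str) -> str:
    """Run the prompt in isolation and return the model's explanation.

    `call_model` is a hypothetical function: prompt text in, reply out.
    """
    return call_model(build_diagnostic_prompt(prompt, failing_input))


if __name__ == "__main__":
    # A fake model that echoes its input, so the sketch runs offline.
    print(diagnose(lambda text: f"[model saw]\n{text}",
                   "Classify the ticket as bug or feature.",
                   "The export button should also support CSV."))
```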
Tool Call Debugging
Log every tool call input and output. When a tool fails:
- Was the input well-formed?
- Did the tool return what the model expected?
- Did the model correctly parse the tool output?
Each of these is a separate failure mode.
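A minimal way to log every tool call is to wrap each tool function in a decorator. The sketch below uses only the Python standard library. The logger name `agent.tools` and the `get_weather` tool are hypothetical, and it assumes tools are called with keyword arguments.

```python
import functools
import json
import logging
from typing import Any, Callable

logger = logging.getLogger("agent.tools")


def logged_tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Log the input and output (or exception) of every call to `fn`."""
    @functools.wraps(fn)
    def wrapper(**kwargs: Any) -> Any:
        logger.info("tool=%s input=%s", fn.__name__,
                    json.dumps(kwargs, default=str))
        try:
            result = fn(**kwargs)
        except Exception:
            # Record the traceback, then let the caller see the error.
            logger.exception("tool=%s raised", fn.__name__)
            raise
        logger.info("tool=%s output=%s", fn.__name__,
                    json.dumps(result, default=str))
        return result
    return wrapper


@logged_tool
def get_weather(city: str) -> dict:
    """Hypothetical tool that returns a canned result."""
    return {"city": city, "temp_c": 21}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    get_weather(city="Oslo")
```

With both sides of each call on record, the three questions above can be answered from the log: the `input` line shows whether the call was well-formed, the `output` line shows what the tool returned, and the model's next message shows how it interpreted that output.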
The Systematic Approach
Use the systematic-debugging skill for a structured 6-step debugging workflow. Never debug agent failures by intuition alone: the state space is too large.