user
#1
Find all open PRs and summarize failing CI.
Flight recorder for production AI agents
AgentTrace turns messy agent runs into clean timelines, failure diagnostics, cost breakdowns, and replay-ready debugging sessions. Built for teams that ship tool-using AI agents and cannot afford mystery failures.
MODEL
I'll inspect the repository and check CI status.
TOOL CALL
github.check_runs({ sha: "HEAD" })
ERROR
401 Unauthorized · repeated 2x
DIAGNOSIS
Validate GitHub credentials before execution and halt on auth failures.
Why did the agent call the same tool five times?
Which step burned the most latency or tokens?
Did the model ignore an error before giving a confident answer?
How do I show a client or teammate what actually happened?
Live MVP
This first version accepts generic JSON traces. The goal is demand validation: prove agent builders want an instant debugging surface before investing in SDKs, storage, teams, and replay infrastructure.
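As a rough illustration of what a "generic JSON trace" could look like, here is a minimal sketch in Python. The field names (`type`, `name`, `args`, `latency_ms`, `tokens`, `content`) and the `load_trace` helper are assumptions made for this example, not AgentTrace's documented schema.

```python
import json

# Hypothetical trace shape: every field name here is an illustrative
# assumption, not a documented AgentTrace format.
RAW_TRACE = """
[
  {"type": "user", "content": "Find all open PRs and summarize failing CI."},
  {"type": "assistant", "latency_ms": 1210, "tokens": 74,
   "content": "I'll inspect the repository and check CI status."},
  {"type": "tool_call", "name": "github.list_prs", "latency_ms": 480,
   "args": {"repo": "acme/web-app", "state": "open"}},
  {"type": "tool_result", "name": "github.list_prs",
   "content": [{"number": 42, "title": "feat: billing portal"}]}
]
"""

KNOWN_TYPES = {"user", "assistant", "tool_call", "tool_result"}


def load_trace(raw: str) -> list[dict]:
    """Parse a JSON array of events and number each step from 1."""
    events = json.loads(raw)
    if not isinstance(events, list):
        raise ValueError("trace must be a JSON array of events")
    for index, event in enumerate(events, start=1):
        if not isinstance(event, dict) or event.get("type") not in KNOWN_TYPES:
            raise ValueError(f"event #{index} has a missing or unknown type")
        event["step"] = index  # matches the #1, #2, ... labels in the timeline
    return events


events = load_trace(RAW_TRACE)
```

Keeping the format to a flat array of typed events means a trace can be exported from almost any agent framework with a few lines of logging code.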
Input
Events: 9
Tool calls: 3
Errors: 2
Est. cost: $0.01
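The headline counts can be derived directly from the event list. The sketch below mirrors the sample run and assumes hypothetical field names (`type`, `latency_ms`, `tokens`, `error`). It leaves out the cost estimate, since that depends on per-model pricing the page does not state.

```python
from collections import Counter

# Hypothetical event records mirroring the sample run; field names are assumptions.
events = [
    {"type": "user"},
    {"type": "assistant", "latency_ms": 1210, "tokens": 74},
    {"type": "tool_call", "name": "github.list_prs", "latency_ms": 480},
    {"type": "tool_result", "name": "github.list_prs"},
    {"type": "tool_call", "name": "github.check_runs", "latency_ms": 650},
    {"type": "tool_result", "name": "github.check_runs",
     "error": "401 Unauthorized: bad credentials"},
    {"type": "tool_call", "name": "github.check_runs", "latency_ms": 640},
    {"type": "tool_result", "name": "github.check_runs",
     "error": "401 Unauthorized: bad credentials"},
    {"type": "assistant", "latency_ms": 1350, "tokens": 112},
]


def summarize(events: list[dict]) -> dict:
    """Roll a trace up into the counters shown above the timeline."""
    by_type = Counter(e["type"] for e in events)
    return {
        "events": len(events),
        "tool_calls": by_type["tool_call"],
        "errors": sum(1 for e in events if e.get("error")),
        "latency_ms": sum(e.get("latency_ms", 0) for e in events),
        "tokens": sum(e.get("tokens", 0) for e in events),
    }


summary = summarize(events)
```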
Timeline
claude-3.5-sonnet
user
#1
Find all open PRs and summarize failing CI.
assistant
#2 · 1210ms · 74 tokens
I'll inspect the repository and check CI status.
tool_call
#3 · 480ms
github.list_prs
{
"repo": "acme/web-app",
"state": "open"
}
tool_result
#4
github.list_prs
[
{
"number": 42,
"title": "feat: billing portal"
}
]
tool_call
#5 · 650ms
github.check_runs
{
"repo": "acme/web-app",
"sha": "HEAD"
}
tool_result
#6
github.check_runs
401 Unauthorized: bad credentials
tool_call
#7 · 640ms
github.check_runs
{
"repo": "acme/web-app",
"sha": "HEAD"
}
tool_result
#8
github.check_runs
401 Unauthorized: bad credentials
assistant
#9 · 1350ms · 112 tokens
CI data is unavailable right now, but the PR appears healthy.
Diagnosis
github.check_runs was called 2 times. Add retry limits and stop conditions.
Auth failure detected. Validate credentials before the agent starts and halt on 401/403 instead of retrying.
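Both findings above can be expressed as simple checks over the event list. This is a minimal sketch under assumed field names (`step`, `type`, `name`, `args`, `error`). The functions `repeated_calls` and `auth_failures` are hypothetical helpers, not part of any published AgentTrace API.

```python
import json
from collections import Counter

# Hypothetical tool events from the sample run; field names are assumptions.
events = [
    {"step": 5, "type": "tool_call", "name": "github.check_runs",
     "args": {"repo": "acme/web-app", "sha": "HEAD"}},
    {"step": 6, "type": "tool_result", "name": "github.check_runs",
     "error": "401 Unauthorized: bad credentials"},
    {"step": 7, "type": "tool_call", "name": "github.check_runs",
     "args": {"repo": "acme/web-app", "sha": "HEAD"}},
    {"step": 8, "type": "tool_result", "name": "github.check_runs",
     "error": "401 Unauthorized: bad credentials"},
]


def repeated_calls(events: list[dict], max_repeats: int = 1) -> dict[str, int]:
    """Return tools invoked with identical arguments more than max_repeats times."""
    seen = Counter(
        (e["name"], json.dumps(e.get("args", {}), sort_keys=True))
        for e in events
        if e["type"] == "tool_call"
    )
    return {name: count for (name, _), count in seen.items() if count > max_repeats}


def auth_failures(events: list[dict]) -> list[int]:
    """Return the steps whose tool result starts with an HTTP 401 or 403 status."""
    return [
        e["step"]
        for e in events
        if e["type"] == "tool_result"
        and str(e.get("error", "")).startswith(("401", "403"))
    ]


loops = repeated_calls(events)
auth_steps = auth_failures(events)
```

Comparing serialized arguments rather than tool names alone avoids flagging legitimate repeated use of one tool with different inputs.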
What it does
Shows prompts, model outputs, tool calls, tool results, errors, retries, latency, and token estimates in one readable sequence.
Flags auth failures, repeated tool loops, bad outputs, schema issues, ignored errors, and expensive dead ends.
Turns failed runs into concrete next actions: credential validation, stop conditions, tool schema hardening, and retry policies.
Start with JSON uploads tonight; SDKs for LangChain, CrewAI, OpenAI, Claude, and custom agents can come after demand is proven.
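One way to act on the suggested next steps (stop conditions, retry policies, halting on auth failures) is a small wrapper around each tool call. The sketch below is an assumption about how an agent author might implement this on their side. `ToolError`, `AuthFailure`, and `call_with_policy` are hypothetical names, not something AgentTrace provides.

```python
class ToolError(Exception):
    """Hypothetical error raised by a tool, carrying an HTTP-style status."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status} {message}")
        self.status = status


class AuthFailure(Exception):
    """Raised to stop the run when credentials are rejected."""


def call_with_policy(tool, args: dict, max_attempts: int = 3):
    """Call a tool with a retry cap; never retry on 401/403."""
    for attempt in range(1, max_attempts + 1):
        try:
            return tool(**args)
        except ToolError as exc:
            if exc.status in (401, 403):
                # Retrying cannot fix bad credentials, so halt immediately.
                raise AuthFailure(f"halting: {exc}") from exc
            if attempt == max_attempts:
                raise  # stop condition reached: surface the last error
```

With this policy, the sample run would have stopped after the first 401 at step #6 instead of issuing the second `github.check_runs` call.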
Founding plan
Founding access includes trace uploads, diagnostics, and early influence on SDK support. Checkout is live through Stripe Payment Links while demand is validated.