Flight recorder for production AI agents

Debug AI agents like production software.

AgentTrace turns messy agent runs into clean timelines, failure diagnostics, cost breakdowns, and replay-ready debugging sessions. Built for teams shipping tool-using AI agents that cannot afford mystery failures.

agenttrace/run_1042

MODEL

I'll inspect the repository and check CI status.

TOOL CALL

github.check_runs({ sha: "HEAD" })

ERROR

401 Unauthorized · repeated 2x

DIAGNOSIS

Validate GitHub credentials before execution and halt on auth failures.

?

Why did the agent call the same tool five times?

?

Which step burned the most latency or tokens?

?

Did the model ignore an error before giving a confident answer?

?

How do I show a client or teammate what actually happened?

Live MVP

Paste a trace. See the failure path.

This first version accepts generic JSON traces. The goal is demand validation: prove agent builders want an instant debugging surface before investing in SDKs, storage, teams, and replay infrastructure.

Input

Drop in an agent run

Events

9

Tool calls

3

Errors

2

Est. cost

$0.01

Timeline

run_agenttrace_demo_1042

claude-3.5-sonnet

user

#1

Find all open PRs and summarize failing CI.

assistant

#2 · 1210ms · 74 tokens

I'll inspect the repository and check CI status.

tool_call

#3 · 480ms

github.list_prs

{
  "repo": "acme/web-app",
  "state": "open"
}

tool_result

#4

github.list_prs

[
  {
    "number": 42,
    "title": "feat: billing portal"
  }
]

tool_call

#5 · 650ms

github.check_runs

{
  "repo": "acme/web-app",
  "sha": "HEAD"
}

tool_result

#6

github.check_runs

401 Unauthorized: bad credentials

tool_call

#7 · 640ms

github.check_runs

{
  "repo": "acme/web-app",
  "sha": "HEAD"
}

tool_result

#8

github.check_runs

401 Unauthorized: bad credentials

assistant

#9 · 1350ms · 112 tokens

CI data is unavailable right now, but the PR appears healthy.

Diagnosis

Failure pattern report

github.check_runs was called 2 times. Add retry limits and stop conditions.

Auth failure detected. Validate credentials before the agent starts and halt on 401/403 instead of retrying.

What it does

Purpose-built for tool-using agents.

Trace timeline

Prompts, model outputs, tool calls, tool results, errors, retries, latency, and token estimates in one readable sequence.

Failure detection

Flags auth failures, repeated tool loops, bad outputs, schema issues, ignored errors, and expensive dead ends.

Fix recommendations

Turns failed runs into concrete next actions: credential validation, stop conditions, tool schema hardening, and retry policies.

Framework neutral

Start with JSON uploads tonight; SDKs for LangChain, CrewAI, OpenAI, Claude, and custom agents can come after demand is proven.

Founding plan

$19/mo for early agent teams.

Founding access includes trace uploads, diagnostics, and early influence on SDK support. Checkout is live through Stripe Payment Links while demand is validated.

Start founding plan