Overview
A playground for your agent evals: view experiments, browse eval fixtures, and compare runs.
Experiments: 0
Total Runs: 0
Eval Fixtures: 0
Latest Pass Rate: —
Recent Experiments
View all →
No experiments yet. Run agent-eval to get started.
Eval Fixtures
View all →
No evals found. Create evals in your evals/ directory.
Compare
Compare two experiment runs side-by-side to see pass rate deltas, duration changes, and per-eval breakdowns.
Open Compare →
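As a rough illustration of what such a comparison might compute, the sketch below assumes each run is a mapping from eval name to a pass flag and a duration in seconds. The names EvalResult, pass_rate, and compare_runs are hypothetical and are not taken from agent-eval; the actual data format may differ.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical per-eval outcome; not agent-eval's real schema."""
    passed: bool
    duration_s: float


def pass_rate(run):
    """Fraction of evals that passed; 0.0 for an empty run."""
    return sum(r.passed for r in run.values()) / len(run) if run else 0.0


def total_duration(run):
    """Sum of per-eval durations in seconds."""
    return sum(r.duration_s for r in run.values())


def compare_runs(baseline, candidate):
    """Compare two runs: overall deltas plus a breakdown of shared evals."""
    shared = sorted(baseline.keys() & candidate.keys())
    per_eval = {
        name: {
            "baseline_passed": baseline[name].passed,
            "candidate_passed": candidate[name].passed,
            "duration_delta_s": candidate[name].duration_s
            - baseline[name].duration_s,
        }
        for name in shared
    }
    return {
        "pass_rate_delta": pass_rate(candidate) - pass_rate(baseline),
        "duration_delta_s": total_duration(candidate) - total_duration(baseline),
        "per_eval": per_eval,
    }
```

A positive pass_rate_delta means the candidate run passed a larger share of its evals than the baseline. Evals present in only one run are left out of the per-eval breakdown in this sketch, though they still count toward each run's overall pass rate and duration.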