Zespan is an AI agent observability and engineering platform. It traces every agent decision, tool call, handoff, and delegation in production. It also provides prompt versioning, built-in LLM-as-judge evaluations, guardrails, cost optimization, and an AI ops assistant called ZespanPilot.

How do I instrument my AI agent with Zespan?

Instrumenting an agent takes two lines of code: import zespan and call zespan.init({ apiKey: process.env.LT_KEY }). This auto-patches calls to OpenAI, Anthropic, Gemini, Bedrock, and Mistral. For framework-level tracing, add one handler: ZespanCallbackHandler for LangChain, ZespanCrewAIListener for CrewAI, or ZespanADKHandler for Google ADK.
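
As a rough illustration, the setup can be wrapped in a guard that fails fast when the key is missing. The zespan.init({ apiKey }) call and the LT_KEY variable come from the answer above; the initTracing wrapper, its types, and the import style shown in the comment are assumptions made for this sketch, which is written so it runs without the SDK installed.

```typescript
// Shape of the options passed to init, per the call shown above.
type InitOptions = { apiKey: string };
// Stand-in type for zespan.init; the real signature may differ (assumption).
type InitFn = (options: InitOptions) => void;

// Hypothetical wrapper: refuse to start tracing when LT_KEY is missing,
// rather than silently initializing with an undefined key.
function initTracing(
  init: InitFn,
  env: Record<string, string | undefined>
): void {
  const apiKey = env["LT_KEY"];
  if (!apiKey) {
    throw new Error("LT_KEY is not set; tracing was not initialized");
  }
  init({ apiKey });
}

// In an application this might look like (not executed here):
//   import zespan from "zespan";            // import style is an assumption
//   initTracing((options) => zespan.init(options), process.env);
```

Passing init in as a parameter keeps the guard testable and avoids depending on how the SDK is actually imported.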

Does Zespan support prompt versioning?

Yes. Zespan includes prompt management with versioning, a playground for iteration, and A/B testing to compare prompt versions against each other in production.

What evaluations does Zespan support?

Zespan ships 12 built-in LLM-as-judge evaluation templates, among them faithfulness, relevance, toxicity, and groundedness. Evaluations run automatically on every trace, with no custom scoring functions required.

How does Zespan compare to Langfuse?

Zespan is agent-native: every span carries agent identity, delegations are first-class trace events, and an agent map is built automatically. Langfuse was built for LLM pipelines and extended to agents later. Zespan also ships 12 built-in eval templates (Langfuse has none), an AI cost optimizer, and ZespanPilot for AI ops. Langfuse offers open-source self-hosting; Zespan does not.

What is the free tier for Zespan?

The free tier includes 10,000 traces per month, 14-day retention, 2 projects, and 1 seat. No credit card required.


Zespan vs BrainTrust

Full observability vs eval-first tooling.

BrainTrust is strong on evaluations and dataset management, which makes it a good choice for teams whose primary workflow is prompt experimentation and offline evaluation. For production observability of deployed agents (real-time tracing, cost attribution, guardrails, incident management, and a conversational copilot), Zespan covers ground that BrainTrust does not.

Capability comparison: Zespan vs BrainTrust
Capability	Zespan	BrainTrust
Production tracing	✓ Full span waterfall	Limited
Agent auto-discovery	✓ Zero config	Not available
Agent health scoring	✓ A–F composite	Not available
Multi-agent delegation	✓ Delegation graphs	Not available
Evaluations	✓ Auto + manual, 12 templates	✓ Strong (core feature)
Dataset management	✓ Trace-to-dataset	✓ Strong (core feature)
Guardrails	✓ 7 types, pre/post	Not available
Cost management	✓ Attribution + forecast + recommendations	Not available
Incident management	✓ Full lifecycle	Not available
Natural language query	✓ ZespanPilot NLQ	Not available
Alerts	✓ Error, latency, cost, eval	Not available

Pick Zespan when…

You're running agents in production and need real-time tracing, not just offline evals.
You need guardrails, runtime safety, or cost controls in production.
You want incident management and alerting alongside quality evaluations.
You want one platform for production observability and evaluation — not two tools.
You need agent monitoring: health scoring, delegation graphs, per-agent cost attribution.

Pick BrainTrust when…

Your primary workflow is offline prompt experimentation and dataset curation.
You don't yet have agents in production — you're in the experimentation phase.
You want the most opinionated eval-first workflow and BrainTrust's scoring UX.

Frequently asked

Does Zespan have strong evaluations like BrainTrust?

Yes. Zespan's auto-evaluators run on every new trace, using 12 built-in LLM-as-judge templates (correctness, faithfulness, toxicity, relevance, and more). Manual eval runs against datasets are also supported, and eval scores are queryable via ZespanPilot.

Can I do dataset management in Zespan?

Yes. Zespan supports named datasets and accepts up to 500 items per call. You can create datasets from production traces with one click (trace-to-dataset), run batch simulations against them, and browse the full run history.
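
If a dataset is larger than the per-call limit, one way to stay within it is to split the items into batches before uploading. The sketch below assumes the 500-item limit applies to each upload call; the chunk helper and the row shape are hypothetical, and no upload call is shown because the dataset API is not described here.

```typescript
// Per-call item limit stated in the answer above.
const MAX_ITEMS_PER_CALL = 500;

// Split items into consecutive batches no larger than `size`.
function chunk<T>(
  items: readonly T[],
  size: number = MAX_ITEMS_PER_CALL
): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const result: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    result.push(items.slice(start, start + size));
  }
  return result;
}

// Hypothetical dataset rows; the real item schema is not given in the source.
const rows = Array.from({ length: 1200 }, (_, i) => ({ input: `question ${i}` }));
const batches = chunk(rows);
console.log(batches.map((batch) => batch.length)); // [ 500, 500, 200 ]
```

Each batch would then be sent in its own call, so 1,200 items need three calls.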

What does Zespan offer that BrainTrust doesn't?

Zespan offers production tracing with a span waterfall, multi-agent delegation graphs, agent health scoring, guardrails (PII, toxicity, format, cost ceiling), cost forecasting, an AI cost optimizer, incident management, alerts (error, latency, cost, and eval), and ZespanPilot NLQ. None of these are part of BrainTrust's current feature set.

Try Zespan — free, no card needed

10K traces/month free. Setup takes under 5 minutes. See why teams switch from BrainTrust.
