Zespan is an AI agent observability and engineering platform. It traces every agent decision, tool call, handoff, and delegation in production. It also provides prompt versioning, built-in LLM-as-judge evaluations, guardrails, cost optimization, and an AI ops assistant called ZespanPilot.

How do I instrument my AI agent with Zespan?

Instrumenting an agent takes two lines of code: import zespan, then call zespan.init({ apiKey: process.env.LT_KEY }). This auto-patches OpenAI, Anthropic, Gemini, Bedrock, and Mistral. For framework-level tracing, add one handler: ZespanCallbackHandler for LangChain, ZespanCrewAIListener for CrewAI, or ZespanADKHandler for Google ADK.

Does Zespan support prompt versioning?

Yes. Zespan includes prompt management with versioning, a playground for iteration, and A/B testing to compare prompt versions against each other in production.

What evaluations does Zespan support?

Zespan ships 12 built-in LLM-as-judge evaluation templates, including faithfulness, relevance, toxicity, and groundedness. Evaluations run automatically on every trace with no custom scoring functions required.

How does Zespan compare to Langfuse?

Zespan is agent-native: every span carries agent identity, delegations are first-class trace events, and an agent map is built automatically. Langfuse was built for LLM pipelines and extended to agents later. Zespan also ships 12 built-in eval templates (Langfuse has none), an AI cost optimizer, and ZespanPilot for AI ops. Langfuse has open-source self-hosting; Zespan does not.

What is the free tier for Zespan?

The free tier includes 10,000 traces per month, 14-day retention, 2 projects, and 1 seat. No credit card required.

Feature — Cost Management

Cut your LLM bill before it cuts your runway.

Attribution across 5 dimensions, 30-day forecasting with confidence bands, and AI-generated optimization recommendations with savings estimates.

Works automatically — no extra setup beyond wrapping your LLM clients.

Works with: Agent attribution, Model attribution, User attribution, Cost forecasting, AI recommendations, Quota enforcement

30 days forecast horizon
5 attribution dimensions
Top 200 per dimension

5-Dimension Attribution

Slice LLM spend by agent, tool, model, user, or operation. For each dimension: total cost, call count, cost per call, average latency, input/output tokens, and error rate. Switch between 24h, 7d, 30d, and 90d windows.

Dimensions: agent, tool, model, user, operation
Top 200 entries per dimension for any time range
Time ranges: 24h, 7d, 30d, 90d
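The per-dimension metrics listed above can be sketched as a simple group-and-aggregate over call records. This is a minimal illustration, not Zespan's implementation: the `CallRecord` shape, the `attributeBy` function, and all field names are hypothetical, and the top-200 cutoff is applied after sorting by total cost, which is an assumption about how "top" is ranked.

```typescript
// Hypothetical record shape: field names are illustrative, not Zespan's schema.
interface CallRecord {
  agent: string;
  tool: string;
  model: string;
  user: string;
  operation: string;
  costUsd: number;
  latencyMs: number;
  inputTokens: number;
  outputTokens: number;
  isError: boolean;
}

type Dimension = 'agent' | 'tool' | 'model' | 'user' | 'operation';

interface DimensionSummary {
  key: string;
  totalCostUsd: number;
  callCount: number;
  costPerCallUsd: number;
  avgLatencyMs: number;
  inputTokens: number;
  outputTokens: number;
  errorRate: number;
}

interface Accumulator {
  cost: number;
  calls: number;
  latency: number;
  inTok: number;
  outTok: number;
  errors: number;
}

// Group records by one dimension, compute the per-group metrics,
// and keep the topN groups ranked by total cost (assumed ranking).
function attributeBy(
  records: CallRecord[],
  dimension: Dimension,
  topN = 200,
): DimensionSummary[] {
  const groups = new Map<string, Accumulator>();
  for (const r of records) {
    const key = r[dimension];
    const g = groups.get(key) ?? { cost: 0, calls: 0, latency: 0, inTok: 0, outTok: 0, errors: 0 };
    g.cost += r.costUsd;
    g.calls += 1;
    g.latency += r.latencyMs;
    g.inTok += r.inputTokens;
    g.outTok += r.outputTokens;
    if (r.isError) g.errors += 1;
    groups.set(key, g);
  }
  return Array.from(groups.entries())
    .map(([key, g]) => ({
      key,
      totalCostUsd: g.cost,
      callCount: g.calls,
      costPerCallUsd: g.cost / g.calls,
      avgLatencyMs: g.latency / g.calls,
      inputTokens: g.inTok,
      outputTokens: g.outTok,
      errorRate: g.errors / g.calls,
    }))
    .sort((a, b) => b.totalCostUsd - a.totalCostUsd)
    .slice(0, topN);
}
```

Calling `attributeBy(records, 'model')` on the same records would give the model view; a time window such as 24h or 30d would be applied by filtering the records before aggregation.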

Screenshot: Zespan cost attribution view showing spend breakdown by agent and model

30-Day Forecasting

OLS linear regression over daily cost history projects spend 30 days forward with low/high confidence bands. Trend classification (increasing/decreasing/stable) and seasonality detection (day-of-week patterns) surface early warnings.

Linear regression over daily cost history
Confidence band: low and high projection
Seasonality detection: per-day-of-week average for scheduling insights
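To make the forecasting approach above concrete, here is a minimal sketch of an ordinary least squares fit over daily cost, projected 30 days forward. Several details are assumptions for illustration only: the function names, the constant-width band of z times the residual standard deviation (a full prediction interval would widen with distance), and the 1% relative-slope threshold for calling a trend "stable". Zespan's actual band and classification rules are not described in this page.

```typescript
// Sketch only: names, thresholds, and the band formula are assumptions.
type Trend = 'increasing' | 'decreasing' | 'stable';

interface ForecastPoint {
  daysAhead: number;
  projected: number;
  low: number;
  high: number;
}

interface Forecast {
  slope: number;
  intercept: number;
  trend: Trend;
  points: ForecastPoint[];
}

// Fit cost = intercept + slope * dayIndex by OLS, then project `horizon` days ahead.
function forecastDailyCost(
  dailyCost: number[],
  horizon = 30,
  z = 1.96,               // assumed band width in residual standard deviations
  stableThreshold = 0.01, // assumed: under 1% of mean daily cost per day is "stable"
): Forecast {
  const n = dailyCost.length;
  if (n < 3) throw new Error('need at least 3 days of history');

  const meanX = (n - 1) / 2;
  const meanY = dailyCost.reduce((sum, v) => sum + v, 0) / n;

  let sxy = 0;
  let sxx = 0;
  dailyCost.forEach((y, x) => {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) * (x - meanX);
  });
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;

  // Residual standard deviation drives the low/high band.
  let sse = 0;
  dailyCost.forEach((y, x) => {
    const err = y - (intercept + slope * x);
    sse += err * err;
  });
  const band = z * Math.sqrt(sse / (n - 2));

  const relativeSlope = meanY === 0 ? 0 : slope / meanY;
  const trend: Trend =
    Math.abs(relativeSlope) < stableThreshold
      ? 'stable'
      : relativeSlope > 0 ? 'increasing' : 'decreasing';

  const points: ForecastPoint[] = [];
  for (let h = 1; h <= horizon; h++) {
    const raw = intercept + slope * (n - 1 + h);
    points.push({
      daysAhead: h,
      projected: Math.max(0, raw), // cost cannot be negative
      low: Math.max(0, raw - band),
      high: Math.max(0, raw + band),
    });
  }
  return { slope, intercept, trend, points };
}

// Average cost per weekday (0..6), given the weekday of the first sample.
function dayOfWeekAverages(dailyCost: number[], firstWeekday: number): number[] {
  const sums = new Array<number>(7).fill(0);
  const counts = new Array<number>(7).fill(0);
  dailyCost.forEach((cost, i) => {
    const weekday = (firstWeekday + i) % 7;
    sums[weekday] += cost;
    counts[weekday] += 1;
  });
  return sums.map((sum, d) => (counts[d] > 0 ? sum / counts[d] : 0));
}
```

A straight line is a deliberately simple model: it is easy to explain and cheap to recompute daily, but it will lag behind sudden step changes, which is where the per-weekday averages and anomaly flags help.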

Screenshot: Zespan 30-day cost forecast with confidence band and trend classification

AI Cost Recommendations

Zespan analyzes your trace data and generates ranked, actionable recommendations — each with projected monthly USD savings, confidence level, effort rating, and a copy-paste code snippet to implement it.

Types: model switch, prompt compression, batching, caching
Projected savings: monthly USD estimate per recommendation
Code snippet included: copy-paste to implement

Anomaly Detection & Quota

AI anomaly detection flags unusual cost spikes in real time, catching them the day they happen rather than the day the invoice arrives. Per-org monthly quotas enforce spending limits in soft mode (overage allowed up to 5× the limit by default) or hard mode (requests over quota return HTTP 429).

Background anomaly detection: flags cost/latency/error spikes automatically
Soft mode: allows overage up to capMultiplier × limit (default 5×)
Hard mode: strict cap — free plan always runs hard mode
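The soft and hard mode rules above can be expressed as a small decision function. This is a sketch under stated assumptions, not Zespan's enforcement code: `checkQuota`, `QuotaConfig`, and the `plan` field are hypothetical names, usage is counted in whatever unit the quota is defined in, and only `capMultiplier` and its default of 5 come from the description above.

```typescript
// Hypothetical types: only capMultiplier and its default of 5 come from the text.
type QuotaMode = 'soft' | 'hard';

interface QuotaConfig {
  monthlyLimit: number;
  mode: QuotaMode;
  capMultiplier?: number;  // soft mode only; defaults to 5
  plan?: 'free' | 'paid';  // free plans are always treated as hard mode
}

interface QuotaDecision {
  allowed: boolean;
  status: 200 | 429;
  overage: boolean; // true when the request is accepted beyond the monthly limit
}

// Decide whether the next request is accepted, given usage so far this month.
function checkQuota(usedThisMonth: number, cfg: QuotaConfig): QuotaDecision {
  const mode: QuotaMode = cfg.plan === 'free' ? 'hard' : cfg.mode;
  const multiplier = mode === 'hard' ? 1 : cfg.capMultiplier ?? 5;
  const cap = cfg.monthlyLimit * multiplier;

  if (usedThisMonth >= cap) {
    return { allowed: false, status: 429, overage: false };
  }
  return {
    allowed: true,
    status: 200,
    overage: usedThisMonth >= cfg.monthlyLimit,
  };
}
```

With a limit of 10,000, hard mode rejects the request once 10,000 units are used, while soft mode with the default multiplier keeps accepting (and flagging overage) until usage reaches 50,000.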

Screenshot: Zespan cost explorer showing daily spend trend with anomaly detection

Get started

Set up in under 5 minutes

Cost Management (TypeScript)

```typescript
// No extra setup: cost tracking is automatic once you wrap your LLM clients.
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';
import { Zespan, wrapOpenAI, wrapAnthropic } from '@zespan/sdk';

// Create the Zespan client with your API key.
const lt = new Zespan({ apiKey: process.env.ZESPAN_API_KEY });

// Every call is now attributed by agent, model, user, and operation.
const openai = wrapOpenAI(new OpenAI(), lt);
const anthropic = wrapAnthropic(new Anthropic(), lt);
```

Frequently asked

How does Zespan know the cost of each LLM call?

The SDK reads token counts from the LLM provider's API response and looks up current pricing for that model. Cost per call is calculated as (input_tokens × input_price) + (output_tokens × output_price). Pricing tables are updated when providers change their rates.
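As a rough illustration of that formula, the sketch below computes cost per call from a token usage object and a pricing table. The model names and prices are placeholders invented for the example, not real provider rates, and `callCostUsd` is a hypothetical helper rather than an SDK function. Prices are stored per million tokens, a common way providers quote them, so the sum is divided by 1,000,000 to get the per-token form used in the formula.

```typescript
// Placeholder pricing: model names and rates are invented for illustration.
interface ModelPrice {
  inputUsdPerMTok: number;  // USD per 1,000,000 input tokens
  outputUsdPerMTok: number; // USD per 1,000,000 output tokens
}

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

const EXAMPLE_PRICING: Record<string, ModelPrice> = {
  'example-small': { inputUsdPerMTok: 2, outputUsdPerMTok: 8 },
  'example-large': { inputUsdPerMTok: 10, outputUsdPerMTok: 30 },
};

// cost = (input_tokens × input_price) + (output_tokens × output_price)
function callCostUsd(
  model: string,
  usage: TokenUsage,
  pricing: Record<string, ModelPrice> = EXAMPLE_PRICING,
): number {
  const price = pricing[model];
  if (!price) {
    throw new Error(`no pricing entry for model "${model}"`);
  }
  return (
    (usage.inputTokens * price.inputUsdPerMTok +
      usage.outputTokens * price.outputUsdPerMTok) / 1e6
  );
}
```

With the placeholder rates above, a call to `example-small` using 1,000 input tokens and 500 output tokens costs (1,000 × 2 + 500 × 8) / 1,000,000 = 0.006 USD.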

Can I attribute cost to specific product features, not just agents?

Yes. Pass an operation tag in the metadata when making LLM calls — e.g., operation: 'summarize-ticket' or operation: 'generate-reply'. The cost attribution view lets you slice by operation to see which product features drive the most spend.

What's the difference between soft mode and hard mode quota enforcement?

Hard mode strictly caps ingest at the monthly quota limit — any request over quota returns 429. Soft mode allows overage up to a configurable cap multiplier (default 5×) before hard blocking, and overage is billed. Free plans always use hard mode with a 1× multiplier.

How actionable are the AI recommendations?

Each recommendation names the specific agent or operation, explains why the change will help (with supporting evidence from your trace data), gives a projected monthly USD savings figure, and includes a copy-paste code snippet. You can mark each one as applied or acknowledged, or snooze it.

