Feature — Cost Management
Cut your LLM bill before it cuts your runway.
Attribution across 5 dimensions, 30-day forecasting with confidence bands, and AI-generated optimization recommendations with savings estimates.
Works automatically — no extra setup beyond wrapping your LLM clients.

Forecast horizon: 30 days
Attribution dimensions: 5
Entries per dimension: Top 200
5-Dimension Attribution
Slice LLM spend by agent, tool, model, user, or operation. Each entry within a dimension shows total cost, call count, cost per call, average latency, input/output tokens, and error rate. Switch between 24h, 7d, 30d, and 90d windows.
- Dimensions: agent, tool, model, user, operation
- Top 200 entries per dimension for any time range
- Time ranges: 24h, 7d, 30d, 90d
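
To make the per-dimension metrics concrete, here is a minimal sketch of how such a rollup could be computed from raw call records. The `CallRecord` shape, the field names, and the `attributeBy` helper are assumptions made for illustration. They are not the Zespan schema or API.

```typescript
// Hypothetical record shape; field names are illustrative assumptions.
interface CallRecord {
  agent: string;
  tool: string;
  model: string;
  user: string;
  operation: string;
  costUsd: number;
  latencyMs: number;
  inputTokens: number;
  outputTokens: number;
  isError: boolean;
}

type Dimension = 'agent' | 'tool' | 'model' | 'user' | 'operation';

interface DimensionStats {
  key: string;
  totalCostUsd: number;
  callCount: number;
  costPerCallUsd: number;
  avgLatencyMs: number;
  inputTokens: number;
  outputTokens: number;
  errorRate: number;
}

// Group records by one dimension, compute the listed metrics,
// and keep the topN entries ranked by total cost.
function attributeBy(records: CallRecord[], dim: Dimension, topN = 200): DimensionStats[] {
  const acc = new Map<string, { cost: number; calls: number; latency: number; inTok: number; outTok: number; errors: number }>();
  for (const r of records) {
    const key = r[dim];
    const a = acc.get(key) ?? { cost: 0, calls: 0, latency: 0, inTok: 0, outTok: 0, errors: 0 };
    a.cost += r.costUsd;
    a.calls += 1;
    a.latency += r.latencyMs;
    a.inTok += r.inputTokens;
    a.outTok += r.outputTokens;
    a.errors += r.isError ? 1 : 0;
    acc.set(key, a);
  }
  return Array.from(acc.entries())
    .map(([key, a]) => ({
      key,
      totalCostUsd: a.cost,
      callCount: a.calls,
      costPerCallUsd: a.cost / a.calls,
      avgLatencyMs: a.latency / a.calls,
      inputTokens: a.inTok,
      outputTokens: a.outTok,
      errorRate: a.errors / a.calls,
    }))
    .sort((x, y) => y.totalCostUsd - x.totalCostUsd)
    .slice(0, topN);
}
```

Calling `attributeBy(records, 'model')` would return one row per model, most expensive first. Filtering `records` to a 24h, 7d, 30d, or 90d window before the call gives the time-range views.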

30-Day Forecasting
OLS linear regression over daily cost history projects spend 30 days forward with low/high confidence bands. Trend classification (increasing/decreasing/stable) and seasonality detection (day-of-week patterns) surface early warnings.
- Linear regression over daily cost history
- Confidence band: low and high projection
- Seasonality detection: per-day-of-week average for scheduling insights
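
The forecasting steps above can be sketched as follows. This is an illustration under stated assumptions, not Zespan's implementation: the band here is a constant margin of `z` times the residual standard error, the trend is classified by comparing the slope to the mean daily cost against a hypothetical `stableThreshold`, and the function names are invented for the example.

```typescript
type Trend = 'increasing' | 'decreasing' | 'stable';

interface ForecastPoint { day: number; expected: number; low: number; high: number; }

interface Forecast { slope: number; intercept: number; trend: Trend; points: ForecastPoint[]; }

// Ordinary least squares over (dayIndex, cost), projected `horizon` days forward.
function forecastDailyCost(history: number[], horizon = 30, z = 1.96, stableThreshold = 0.01): Forecast {
  const n = history.length;
  if (n < 3) throw new Error('need at least 3 days of history');

  const xMean = (n - 1) / 2;
  const yMean = history.reduce((sum, y) => sum + y, 0) / n;
  let sxx = 0;
  let sxy = 0;
  history.forEach((y, x) => {
    sxx += (x - xMean) * (x - xMean);
    sxy += (x - xMean) * (y - yMean);
  });
  const slope = sxy / sxx;
  const intercept = yMean - slope * xMean;

  // Residual standard error with n - 2 degrees of freedom.
  let sse = 0;
  history.forEach((y, x) => {
    const residual = y - (intercept + slope * x);
    sse += residual * residual;
  });
  const margin = z * Math.sqrt(sse / (n - 2)); // assumed band width

  // Assumed rule: daily change relative to mean daily cost.
  const relativeSlope = yMean === 0 ? 0 : slope / yMean;
  const trend: Trend =
    relativeSlope > stableThreshold ? 'increasing'
    : relativeSlope < -stableThreshold ? 'decreasing'
    : 'stable';

  const points: ForecastPoint[] = [];
  for (let h = 1; h <= horizon; h++) {
    const expected = intercept + slope * (n - 1 + h);
    points.push({ day: h, expected, low: Math.max(0, expected - margin), high: expected + margin });
  }
  return { slope, intercept, trend, points };
}

// Per-day-of-week average cost; firstWeekday is the weekday (0-6) of history[0].
function weekdayAverages(history: number[], firstWeekday: number): number[] {
  const sums = new Array<number>(7).fill(0);
  const counts = new Array<number>(7).fill(0);
  history.forEach((cost, i) => {
    const weekday = (firstWeekday + i) % 7;
    sums[weekday] += cost;
    counts[weekday] += 1;
  });
  return sums.map((sum, d) => (counts[d] > 0 ? sum / counts[d] : 0));
}
```

A constant margin keeps the sketch short. A stricter prediction interval would widen as the projection moves further from the observed days.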

AI Cost Recommendations
Zespan analyzes your trace data and generates ranked, actionable recommendations — each with projected monthly USD savings, confidence level, effort rating, and a copy-paste code snippet to implement it.
- Types: model switch, prompt compression, batching, caching
- Projected savings: monthly USD estimate per recommendation
- Code snippet included: copy-paste to implement

Anomaly Detection & Quota
AI anomaly detection flags unusual cost spikes in real time, so you catch them the day they happen, not the day the invoice arrives. Per-org monthly quotas enforce spending limits in either soft mode (overage allowed up to a cap multiplier of the limit, 5× by default) or hard mode (requests over quota are rejected with HTTP 429).
- Background anomaly detection: flags cost/latency/error spikes automatically
- Soft mode: allows overage up to capMultiplier × limit (default 5×)
- Hard mode: strict cap — free plan always runs hard mode
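
The two enforcement modes can be sketched as a single check. This is a minimal illustration of the rules listed above, assuming usage is counted in ingested units per month. The `QuotaConfig` shape, the `'paid'` plan label, and the `checkQuota` name are hypothetical, not part of the Zespan API.

```typescript
type QuotaMode = 'soft' | 'hard';

// Hypothetical config shape for illustration only.
interface QuotaConfig {
  plan: 'free' | 'paid';
  mode: QuotaMode;
  monthlyLimit: number;
  capMultiplier?: number; // soft mode only; defaults to 5
}

interface QuotaDecision {
  status: 200 | 429;
  overage: boolean; // true when the request is accepted beyond the monthly limit
}

// `used` is the usage already recorded this month, before the incoming request.
function checkQuota(used: number, cfg: QuotaConfig): QuotaDecision {
  // Free plans always run hard mode, which is equivalent to a 1x multiplier.
  const hard = cfg.plan === 'free' || cfg.mode === 'hard';
  const multiplier = hard ? 1 : (cfg.capMultiplier ?? 5);
  const ceiling = cfg.monthlyLimit * multiplier;

  if (used >= ceiling) {
    return { status: 429, overage: false };
  }
  return { status: 200, overage: used >= cfg.monthlyLimit };
}
```

With a limit of 10,000, a soft-mode org at the default multiplier keeps ingesting (flagged as overage) until usage reaches 50,000, while a hard-mode or free-plan org is rejected as soon as usage reaches 10,000.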

Get started
Set up in under 5 minutes
// No extra setup — cost tracking is automatic when you wrap your LLM clients.
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';
import { Zespan, wrapOpenAI, wrapAnthropic } from '@zespan/sdk';
const lt = new Zespan({ apiKey: process.env.ZESPAN_API_KEY });
// Every call now attributed by agent, model, user, and operation
const openai = wrapOpenAI(new OpenAI(), lt);
const anthropic = wrapAnthropic(new Anthropic(), lt);

Frequently asked
How does Zespan know the cost of each LLM call?
The SDK reads token counts from the LLM provider's API response and looks up current pricing for that model. Cost per call is calculated as (input_tokens × input_price) + (output_tokens × output_price). Pricing tables are updated when providers change their rates.
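
As a worked illustration of that formula, the sketch below computes the cost of one call from a pricing table. The table contents are placeholder numbers, not real provider rates, and `callCostUsd` is a hypothetical helper rather than an SDK function.

```typescript
interface ModelPrice {
  inputPerToken: number;  // USD per input token
  outputPerToken: number; // USD per output token
}

// Placeholder prices for illustration; not real provider rates.
const EXAMPLE_PRICES: Record<string, ModelPrice> = {
  'example-model': { inputPerToken: 0.000002, outputPerToken: 0.000008 },
};

// cost = (input_tokens x input_price) + (output_tokens x output_price)
function callCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
  prices: Record<string, ModelPrice> = EXAMPLE_PRICES,
): number {
  const price = prices[model];
  if (!price) throw new Error(`no pricing for model: ${model}`);
  return inputTokens * price.inputPerToken + outputTokens * price.outputPerToken;
}
```

With these placeholder rates, a call with 1,000 input tokens and 500 output tokens costs about $0.006: 1,000 × 0.000002 plus 500 × 0.000008.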
Can I attribute cost to specific product features, not just agents?
Yes. Pass an operation tag in the metadata when making LLM calls — e.g., operation: 'summarize-ticket' or operation: 'generate-reply'. The cost attribution view lets you slice by operation to see which product features drive the most spend.
What's the difference between soft mode and hard mode quota enforcement?
Hard mode strictly caps ingest at the monthly quota limit — any request over quota returns 429. Soft mode allows overage up to a configurable cap multiplier (default 5×) before hard blocking, and overage is billed. Free plans always use hard mode with a 1× multiplier.
How actionable are the AI recommendations?
Each recommendation names the specific agent or operation, explains why the change will help (with supporting evidence from your trace data), gives a projected monthly USD savings figure, and includes a copy-paste code snippet. You can mark each one as applied or acknowledged, or snooze it.
Start free — 10K traces/month, no card needed
Setup takes under 5 minutes. Works with OpenAI, Anthropic, LangChain, and more.