Feature — Cost Management

Cut your LLM bill before it cuts your runway.

Attribution across 5 dimensions, 30-day forecasting with confidence bands, and AI-generated optimization recommendations with savings estimates.

Works automatically — no extra setup beyond wrapping your LLM clients.

Works with: Agent attribution, Model attribution, User attribution, Cost forecasting, AI recommendations, Quota enforcement

  • 30-day forecast horizon
  • 5 attribution dimensions
  • Top 200 entries per dimension

5-Dimension Attribution

Slice LLM spend by agent, tool, model, user, or operation. For each dimension: total cost, call count, cost per call, average latency, input/output tokens, and error rate. Switch between 24h, 7d, 30d, and 90d windows.

  • Dimensions: agent, tool, model, user, operation
  • Top 200 entries per dimension for any time range
  • Time ranges: 24h, 7d, 30d, 90d
[Figure: Zespan cost attribution view showing spend breakdown by agent and model]
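
To make the per-dimension metrics concrete, here is a minimal sketch of how such a breakdown could be computed from raw call records. The `CallRecord` shape, the `attributeBy` helper, and the choice to rank the top entries by total cost are assumptions made for illustration; they are not the Zespan schema or API.

```typescript
// Hypothetical record shape for one LLM call; field names are illustrative only.
interface CallRecord {
  agent: string;
  tool: string;
  model: string;
  user: string;
  operation: string;
  costUsd: number;
  latencyMs: number;
  inputTokens: number;
  outputTokens: number;
  isError: boolean;
}

type Dimension = 'agent' | 'tool' | 'model' | 'user' | 'operation';

interface DimensionStats {
  key: string;
  totalCostUsd: number;
  callCount: number;
  costPerCallUsd: number;
  avgLatencyMs: number;
  inputTokens: number;
  outputTokens: number;
  errorRate: number;
}

// Group records by one dimension and return the top N groups.
// Ranking by total cost is an assumption of this sketch.
function attributeBy(records: CallRecord[], dimension: Dimension, topN = 200): DimensionStats[] {
  const groups = new Map<string, { cost: number; calls: number; latency: number; inTok: number; outTok: number; errors: number }>();
  for (const r of records) {
    const key = r[dimension];
    const g = groups.get(key) ?? { cost: 0, calls: 0, latency: 0, inTok: 0, outTok: 0, errors: 0 };
    g.cost += r.costUsd;
    g.calls += 1;
    g.latency += r.latencyMs;
    g.inTok += r.inputTokens;
    g.outTok += r.outputTokens;
    g.errors += r.isError ? 1 : 0;
    groups.set(key, g);
  }
  return [...groups.entries()]
    .map(([key, g]) => ({
      key,
      totalCostUsd: g.cost,
      callCount: g.calls,
      costPerCallUsd: g.cost / g.calls,
      avgLatencyMs: g.latency / g.calls,
      inputTokens: g.inTok,
      outputTokens: g.outTok,
      errorRate: g.errors / g.calls,
    }))
    .sort((a, b) => b.totalCostUsd - a.totalCostUsd)
    .slice(0, topN);
}
```

Calling `attributeBy(records, 'operation')` on records already filtered to a 7d window would give the same kind of table as the attribution view, one row per operation.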

30-Day Forecasting

Ordinary least squares (OLS) linear regression over daily cost history projects spend 30 days forward with low and high confidence bands. Trend classification (increasing, decreasing, or stable) and seasonality detection (day-of-week patterns) surface early warnings.

  • Linear regression over daily cost history
  • Confidence band: low and high projection
  • Seasonality detection: per-day-of-week average for scheduling insights
[Figure: Zespan 30-day cost forecast with confidence band and trend classification]
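
The forecasting approach described above can be sketched in a few lines. This is an illustration under stated assumptions, not Zespan's implementation: the band here is a 95% prediction interval using a normal approximation (`z = 1.96`), the trend is called stable when the fitted daily change is under 1% of mean daily cost, and the function names are hypothetical.

```typescript
interface ForecastPoint { day: number; expected: number; low: number; high: number }
interface Forecast {
  slope: number;
  intercept: number;
  trend: 'increasing' | 'decreasing' | 'stable';
  points: ForecastPoint[];
}

// Fit y = intercept + slope * x by ordinary least squares, where x is the day index.
function forecastDailyCost(history: number[], horizon = 30, z = 1.96, stableThreshold = 0.01): Forecast {
  const n = history.length;
  if (n < 3) throw new Error('need at least 3 days of history');
  const meanX = (n - 1) / 2;
  const meanY = history.reduce((s, y) => s + y, 0) / n;
  let sxx = 0;
  let sxy = 0;
  for (let x = 0; x < n; x++) {
    sxx += (x - meanX) ** 2;
    sxy += (x - meanX) * (history[x] - meanY);
  }
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;

  // Residual standard error with n - 2 degrees of freedom.
  let sse = 0;
  for (let x = 0; x < n; x++) sse += (history[x] - (intercept + slope * x)) ** 2;
  const sigma = Math.sqrt(sse / (n - 2));

  // Assumed rule: compare the daily slope to mean daily cost.
  const relative = meanY === 0 ? 0 : slope / Math.abs(meanY);
  const trend: Forecast['trend'] =
    relative > stableThreshold ? 'increasing' : relative < -stableThreshold ? 'decreasing' : 'stable';

  const points: ForecastPoint[] = [];
  for (let h = 1; h <= horizon; h++) {
    const x = n - 1 + h;
    const expected = intercept + slope * x;
    // Prediction interval half-width; it widens the further out we project.
    const half = z * sigma * Math.sqrt(1 + 1 / n + (x - meanX) ** 2 / sxx);
    points.push({ day: h, expected, low: Math.max(0, expected - half), high: expected + half });
  }
  return { slope, intercept, trend, points };
}

// Average cost per day of week (index 0 = Sunday, UTC); null where there is no data.
function dayOfWeekAverages(days: { date: Date; costUsd: number }[]): (number | null)[] {
  const sums = new Array<number>(7).fill(0);
  const counts = new Array<number>(7).fill(0);
  for (const d of days) {
    const dow = d.date.getUTCDay();
    sums[dow] += d.costUsd;
    counts[dow] += 1;
  }
  return sums.map((s, i) => (counts[i] > 0 ? s / counts[i] : null));
}
```

A plain linear fit cannot capture weekly cycles on its own, which is why the day-of-week averages are computed separately rather than folded into the regression.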

AI Cost Recommendations

Zespan analyzes your trace data and generates ranked, actionable recommendations — each with projected monthly USD savings, confidence level, effort rating, and a copy-paste code snippet to implement it.

  • Types: model switch, prompt compression, batching, caching
  • Projected savings: monthly USD estimate per recommendation
  • Code snippet included: copy-paste to implement

Anomaly Detection & Quota

AI anomaly detection flags unusual cost spikes in real time, so you catch them the day they happen, not the day the invoice arrives. Per-org monthly quotas enforce spending limits in soft mode (overage allowed up to a configurable multiple of the limit, 5× by default) or hard mode (requests over quota are rejected with HTTP 429).

  • Background anomaly detection: flags cost/latency/error spikes automatically
  • Soft mode: allows overage up to capMultiplier × limit (default 5×)
  • Hard mode: strict cap — free plan always runs hard mode
[Figure: Zespan cost explorer showing daily spend trend with anomaly detection]
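
The soft and hard quota rules can be expressed as a small decision function. The sketch below is a hypothetical illustration, not the Zespan enforcement code: it assumes the quota counts ingested units such as traces, and the names `QuotaPolicy` and `checkQuota` are invented for this example. Only the 5× default, the 429 status, and the free-plan rule come from the description above.

```typescript
type QuotaMode = 'soft' | 'hard';

interface QuotaPolicy {
  monthlyLimit: number;    // units included in the plan per month
  mode: QuotaMode;
  capMultiplier?: number;  // soft mode only; defaults to 5
  freePlan?: boolean;      // free plans always run hard mode
}

interface QuotaDecision {
  allowed: boolean;
  status: 200 | 429;
  overageUnits: number;    // units beyond the monthly limit after this request
}

// Decide whether one more unit may be ingested, given usage so far this month.
function checkQuota(usedThisMonth: number, policy: QuotaPolicy): QuotaDecision {
  const hard = policy.freePlan === true || policy.mode === 'hard';
  // Hard mode behaves like a 1x multiplier; soft mode uses capMultiplier.
  const multiplier = hard ? 1 : policy.capMultiplier ?? 5;
  const ceiling = policy.monthlyLimit * multiplier;

  if (usedThisMonth >= ceiling) {
    const overageSoFar = Math.max(0, usedThisMonth - policy.monthlyLimit);
    return { allowed: false, status: 429, overageUnits: overageSoFar };
  }
  const overageUnits = Math.max(0, usedThisMonth + 1 - policy.monthlyLimit);
  return { allowed: true, status: 200, overageUnits };
}
```

With a limit of 1,000 units, hard mode rejects the 1,001st request, while soft mode with the default multiplier keeps accepting requests until 5,000 units have been ingested.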

Get started

Set up in under 5 minutes

TypeScript: Cost Management
// No extra setup: cost tracking is automatic once you wrap your LLM clients.
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';
import { Zespan, wrapOpenAI, wrapAnthropic } from '@zespan/sdk';

// Create the Zespan client with your API key.
const lt = new Zespan({ apiKey: process.env.ZESPAN_API_KEY });

// Every call is now attributed by agent, model, user, and operation.
const openai = wrapOpenAI(new OpenAI(), lt);
const anthropic = wrapAnthropic(new Anthropic(), lt);
Frequently asked

How does Zespan know the cost of each LLM call?

The SDK reads token counts from the LLM provider's API response and looks up current pricing for that model. Cost per call is calculated as (input_tokens × input_price) + (output_tokens × output_price). Pricing tables are updated when providers change their rates.
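
As a worked illustration of that formula, the sketch below computes the cost of a single call. The price table, the model name, and the function name are placeholders invented for this example, and prices are expressed per million tokens because that is how providers typically quote them; none of this reflects real rates or the SDK's internals.

```typescript
interface ModelPrice {
  inputPerMTokUsd: number;   // USD per 1,000,000 input tokens
  outputPerMTokUsd: number;  // USD per 1,000,000 output tokens
}

// Placeholder prices for a made-up model; not real provider rates.
const EXAMPLE_PRICES: Record<string, ModelPrice> = {
  'example-model': { inputPerMTokUsd: 2, outputPerMTokUsd: 8 },
};

// cost = (input_tokens × input_price) + (output_tokens × output_price)
function callCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
  prices: Record<string, ModelPrice> = EXAMPLE_PRICES,
): number {
  const price = prices[model];
  if (!price) throw new Error(`no pricing entry for model: ${model}`);
  return (inputTokens * price.inputPerMTokUsd + outputTokens * price.outputPerMTokUsd) / 1_000_000;
}
```

With these placeholder prices, a call that uses 1,000 input tokens and 500 output tokens costs (1,000 × $2 + 500 × $8) / 1,000,000 = $0.006.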

Can I attribute cost to specific product features, not just agents?

Yes. Pass an operation tag in the metadata when making LLM calls — e.g., operation: 'summarize-ticket' or operation: 'generate-reply'. The cost attribution view lets you slice by operation to see which product features drive the most spend.

What's the difference between soft mode and hard mode quota enforcement?

Hard mode strictly caps ingest at the monthly quota limit: any request over quota returns HTTP 429. Soft mode allows overage up to a configurable cap multiplier (default 5×) before blocking, and the overage is billed. Free plans always use hard mode with a 1× multiplier.

How actionable are the AI recommendations?

Each recommendation names the specific agent or operation, explains why the change will help (with supporting evidence from your trace data), gives a projected monthly USD savings figure, and includes a copy-paste code snippet. You can mark each one as applied or acknowledged, or snooze it.

Start free — 10K traces/month, no card needed

Setup takes under 5 minutes. Works with OpenAI, Anthropic, LangChain, and more.