Groq

Groq Integration

Drop-in wrapper for the Groq SDK. Every Llama, Mixtral, and Gemma inference call is traced with accurate sub-second latency measurement.

Groq provides ultra-low-latency inference for open models (Llama 3, Mixtral, Gemma, Whisper) via a dedicated LPU inference engine.

zespan.com — groq trace
Zespan performance view showing Groq sub-second latency distribution with P50/P95/P99

Getting Started

1

Install the SDK

Install @zespan/sdk alongside groq-sdk.

bash
npm install @zespan/sdk groq-sdk
2

Wrap the Groq client

Pass your Groq client to wrapGroq(). All chat completion calls are traced with accurate latency measurement.

typescript
import Groq from 'groq-sdk';
import { Zespan, wrapGroq } from '@zespan/sdk';

const lt = new Zespan({ apiKey: process.env.ZESPAN_API_KEY });
const groq = wrapGroq(new Groq(), lt);

const completion = await groq.chat.completions.create({
  model: 'llama-3.3-70b-versatile',
  messages: [{ role: 'user', content: 'Hello' }],
});
3

View traces in Zespan

Open Trace Explorer. Groq calls appear with model, token usage, cost, and latency. The Performance view shows Groq's sub-second latency distribution.

What's captured automatically

  • Accurate sub-second latency: time-to-first-token and total latency measured precisely
  • All Groq models: Llama 3, Mixtral, Gemma, Whisper, and future additions
  • Model comparison: compare cost and latency across Groq models in Cost Attribution
  • Rate limit tracking: 429 errors attributed with full context
  • Audio transcription: Whisper calls traced alongside chat completions

FAQ

Does Zespan capture Groq's fast time-to-first-token accurately?

Yes. The SDK measures time from request start to first token received, which is the meaningful latency number for Groq's streaming completions.

Can I compare Groq vs OpenAI latency in Zespan?

Yes. Filter the Performance view by provider and compare P50/P95/P99 latency across providers and models side by side.

Start for free — 10K traces/month, no card needed

Groq integration works on all plans including the free tier.

← All integrations