OpenTelemetry on day one: every step is a span
If you can't explain what an agent did six hours after it did it, you can't operate it. Tutti emits OTEL spans for every run, LLM call, and tool invocation — and you wire it into whatever you already use.
An agent system without observability is a system you can run, not a system you can operate. The two are not the same.
Running it means watching the demo work. Operating it means: a customer says "your bot deleted my issue at 14:30 yesterday and I want to know why." You need to find the run, find the span, find the prompt, find the tool call, find the model decision — in under a minute, six hours later. That's a different problem.
Tutti is built to be operable from day one. Every run, every LLM call, every tool invocation, every routing decision, every interrupt is an OpenTelemetry span. There's no "enable observability" toggle. Spans are how the runtime works internally; you just point them somewhere.
What gets traced
The `@tuttiai/telemetry` package wraps the agent loop in a parent span and adds child spans for:
- `agent.run` — the top-level call. Has the agent name, the input, the session ID, the model used.
- `llm.call` — every Anthropic / OpenAI / Gemini round-trip. Has the model, the input/output token counts, the cost estimate, whether the call hit cache.
- `tool.call` — every voice tool invocation. Has the voice name, tool name, sanitised input, result size, error flag.
- `router.decision` — when `SmartProvider` is used, every routing decision. Has the input classifier signal, the tier chosen, the model selected, the cost ceiling.
- `interrupt.requested` — every HITL pause. Has the tool, the arguments, who approved, how long it waited.
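To make that concrete, here is a minimal sketch of how one of these spans might be emitted with the standard OpenTelemetry JS API. The span name and the fields it carries come from the list above; the wrapper function, attribute keys, and error handling are illustrative assumptions, not Tutti's actual internals.

```ts
import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("@tuttiai/telemetry");

// Hypothetical wrapper: records one tool invocation as a `tool.call` span.
// Attribute keys are assumed; only the span's contents mirror the list above.
async function tracedToolCall<T>(
  voice: string,
  tool: string,
  sanitisedInput: unknown,
  invoke: () => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan("tool.call", async (span) => {
    span.setAttributes({
      "tool.voice": voice,
      "tool.name": tool,
      "tool.input": JSON.stringify(sanitisedInput),
    });
    try {
      const result = await invoke();
      span.setAttribute("tool.result_size", JSON.stringify(result).length);
      span.setAttribute("tool.error", false);
      return result;
    } catch (err) {
      span.setAttribute("tool.error", true);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}
```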
That's already enough to answer "why did the agent do X?" without asking the agent.
How you wire it up
```bash
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.honeycomb.io/v1/traces \
OTEL_EXPORTER_OTLP_HEADERS=x-honeycomb-team=YOUR_KEY \
tutti-ai run
```

That's the entire setup for Honeycomb. Same shape for Tempo, for Datadog, for Jaeger, for any OTLP collector. Tutti uses the OpenTelemetry SDK directly — no Tutti-specific dashboard, no Tutti-hosted ingestion, no proprietary format. Your existing observability stack gets the data; your existing alerting fires on it.
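Because Tutti rides the stock OTEL SDK, the standard programmatic setup should work just as well as the env vars. A sketch using the OpenTelemetry Node packages (these package names and options come from the OTEL JS distribution, not from Tutti):

```ts
import { NodeSDK } from "@opentelemetry/sdk-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";

// Equivalent to the env vars above, expressed in code.
const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: "https://api.honeycomb.io/v1/traces",
    headers: { "x-honeycomb-team": process.env.HONEYCOMB_KEY ?? "" },
  }),
});
sdk.start();
```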
If you don't have an OTEL collector wired up yet, `@tuttiai/telemetry` ships with a `JsonFileExporter` that writes spans to a local JSON file you can grep. It's not pretty but it's enough for development.
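Wiring that in might look like the following. This is a sketch only: `JsonFileExporter` is the export named above, but the constructor option shown (a single output path) is an assumption; check the package docs for the real signature.

```ts
import { NodeSDK } from "@opentelemetry/sdk-node";
import { JsonFileExporter } from "@tuttiai/telemetry";

// Development-only: dump spans to a local file you can grep.
// The { path } option is assumed, not confirmed against the package.
const sdk = new NodeSDK({
  traceExporter: new JsonFileExporter({ path: "./traces.json" }),
});
sdk.start();
```

From there, grep or `jq` against the file is enough to answer span-level questions while developing.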
Why "OTEL or nothing" is right
Depending on a Tutti-hosted observability product would mean vendor lock-in, latency in the data path, an extra bill, and a blast radius tied to Tutti's uptime. We chose to stand on OpenTelemetry instead. It's the standard. Every observability vendor speaks it. Your team probably already runs an OTEL collector for everything else. There's no reason for an agent framework to invent its own.
What you get from this
Three things, immediately:
1. Replay. Use `tutti-ai replay`
You can ship without OTEL. You won't operate without OTEL. Wiring it up before you have an incident is significantly cheaper than wiring it up during one.