Voices: the plugin model agent frameworks need
Most frameworks treat tools as a flat list. Every team rebuilds the same GitHub wrapper, the same Slack wrapper, the same Stripe wrapper. Voices are how Tutti stops that.
Open the docs for almost any agent framework and look at how tools are introduced. You'll see something like: "define a function, write a docstring, register it with the agent." The framework gives you a registration helper and gets out of the way.
That works for one tool. It does not work for an ecosystem.
What you actually want, three months in, is to install `@something/github` from npm and have ten typed tools, with sensible permission scopes, with rate-limit awareness, with HITL gating on the destructive ones, just appear. You want the same shape for Slack, Postgres, Stripe, and the in-house thing nobody else uses. You want versioned, semver-respecting upgrades. You want to ship your own.
In Tutti, that unit is called a voice.
The Voice and Tool interfaces
A voice is a TypeScript object that implements two interfaces from `@tuttiai/types`:
```ts
export interface Tool {
  name: string
  description: string
  parameters: ZodType
  destructive?: boolean
  execute(args: unknown): Promise<{ content: string }>
}

export interface Voice {
  name: string
  description?: string
  tools: Tool[]
  required_permissions: Permission[]
  setup?(context: VoiceContext): Promise<void>
  teardown?(): Promise<void>
}
```
That's the whole contract. The runtime doesn't care whether your voice is a class, a factory function, or a literal — as long as it conforms.
Why this shape
Every field is there because it earned its place:
- `parameters: ZodType` — the LLM sees a JSON Schema derived from the Zod type. The runtime validates inputs with the same Zod type. No drift between "what the model thinks" and "what the function accepts."
- `required_permissions` — the runtime calls `PermissionGuard.check()` before loading the voice. If the agent didn't grant it, the voice is rejected at startup, not at the first failing tool call.
- `destructive` — flips HITL gating on. The runtime pauses for operator approval before `execute` runs. No app-side wiring needed.
- `setup` / `teardown` — opens connections, logs in, warms caches at startup; closes them on shutdown. The runtime guarantees `teardown` runs.
What a voice looks like end-to-end
```ts
import type { Voice, Tool } from '@tuttiai/types'
import { z } from 'zod'

const get_forecast: Tool = {
  name: 'get_forecast',
  description: 'Forecast for a city.',
  parameters: z.object({ city: z.string() }),
  execute: async ({ city }) => {
    const r = await fetch(`https://api.weather.com/${city}`)
    return { content: await r.text() }
  },
}

export const WeatherVoice: Voice = {
  name: 'weather',
  required_permissions: ['network'],
  tools: [get_forecast],
}
```
That's it. Publish to npm under `@yourorg/weather` and any Tutti score file can install it and grant it network access. No framework registration ceremony. No globals.
Why typed plugins beat string-based ones
Frameworks that treat tools as strings in a flat list cannot give you autocomplete, cannot catch a typo at compile time, and cannot surface that two voices both register a `search` tool until they collide at runtime. Tutti voices are first-class TypeScript imports. Your IDE knows what they expose. The compiler tells you when an agent references a voice that doesn't exist.
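A small sketch of what that buys you, reusing the `WeatherVoice` shape from above (with inline stand-ins for the `@tuttiai/types` interfaces so the snippet runs on its own):

```typescript
// Inline stand-ins for illustration; the real interfaces live in @tuttiai/types.
interface Tool { name: string; description: string }
interface Voice { name: string; tools: Tool[]; required_permissions: string[] }

const WeatherVoice: Voice = {
  name: 'weather',
  required_permissions: ['network'],
  tools: [{ name: 'get_forecast', description: 'Forecast for a city.' }],
}

// The compiler knows exactly what the voice exposes:
const toolNames = WeatherVoice.tools.map((t) => t.name) // typed as string[]

// A typo is a compile-time error, not a runtime collision:
// WeatherVoice.toools  // error: Property 'toools' does not exist on type 'Voice'
```

None of this is possible when a tool is just a string key in a registry; the type information is gone by the time anything can check it.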
Why this matters for the ecosystem
Twelve voices ship today: filesystem, github, playwright, web, sandbox, rag, discord, slack, twitter, postgres, stripe, mcp. The MCP voice means anything with an MCP server is reachable in one install. And anyone — you, me, a contributor — can write a voice in twenty lines and publish it. The plugin ecosystem is the framework.