Voices: the plugin model agent frameworks need

Most frameworks treat tools as a flat list. Every team rebuilds the same GitHub wrapper, the same Slack wrapper, the same Stripe wrapper. Voices are how Tutti stops that.

Chihab
Building Tutti AI · 6 min read

Open the docs for almost any agent framework and look at how tools are introduced. You'll see something like: "define a function, write a docstring, register it with the agent." The framework gives you a registration helper and gets out of the way.

That works for one tool. It does not work for an ecosystem.

What you actually want, three months in, is to install `@something/github` from npm and have ten typed tools just appear, with sensible permission scopes, rate-limit awareness, and human-in-the-loop (HITL) gating on the destructive ones. You want the same shape for Slack, Postgres, Stripe, and the in-house thing nobody else uses. You want versioned, semver-respecting upgrades. You want to ship your own.

In Tutti, that unit is called a voice.

The Voice and Tool interfaces

A voice is a TypeScript object that implements two interfaces from `@tuttiai/types`:

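```ts
// Type arguments below are reconstructed. The `any` default and the result
// shape are assumptions based on the `{ content }` value returned in this
// post's example.
export interface Tool<T = any> {
  name: string
  description: string
  parameters: ZodType<T>
  execute(input: T, context: ToolContext): Promise<{ content: string }>
  destructive?: boolean
}

export interface Voice {
  name: string
  description?: string
  tools: Tool[]
  required_permissions: Permission[]
  setup?(context: VoiceContext): Promise<void>
  teardown?(): Promise<void>
}
```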

That's the whole contract. The runtime doesn't care whether your voice is a class, a factory function, or a literal — as long as it conforms.
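To make the "class, factory function, or literal" point concrete, here is a minimal sketch. The `Voice` and `Tool` shapes are simplified local stand-ins for the `@tuttiai/types` interfaces (no Zod schema, no context argument) so the snippet runs on its own. `echoLiteral`, `createEchoVoice`, and `EchoVoice` are hypothetical names, not part of Tutti.

```typescript
// Simplified local stand-ins (assumption: trimmed down so the sketch runs
// without @tuttiai/types or zod installed).
interface Tool {
  name: string
  description: string
  execute(input: { text: string }): Promise<{ content: string }>
}

interface Voice {
  name: string
  tools: Tool[]
  required_permissions: string[]
}

const echo: Tool = {
  name: 'echo',
  description: 'Returns its input unchanged.',
  execute: async ({ text }) => ({ content: text }),
}

// 1. A plain object literal.
const echoLiteral: Voice = {
  name: 'echo-literal',
  tools: [echo],
  required_permissions: [],
}

// 2. A factory function, handy when the voice needs configuration.
function createEchoVoice(prefix: string): Voice {
  return {
    name: `${prefix}-echo`,
    tools: [echo],
    required_permissions: [],
  }
}

// 3. A class whose instances satisfy the same interface.
class EchoVoice implements Voice {
  name = 'echo-class'
  tools = [echo]
  required_permissions: string[] = []
}

// All three are interchangeable wherever a Voice is expected.
const voices: Voice[] = [echoLiteral, createEchoVoice('team'), new EchoVoice()]
const voiceNames = voices.map((v) => v.name)
```

The factory form is usually the one to reach for once a voice needs API keys or a base URL, since the configuration can be closed over instead of living in a global.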

Why this shape

Every field is there because it earned its place:

- `parameters: ZodType`: the LLM sees a JSON Schema derived from the Zod type. The runtime validates inputs with the same Zod type. No drift between "what the model thinks" and "what the function accepts."
- `required_permissions`: the runtime calls `PermissionGuard.check()` before loading the voice. If the agent didn't grant it, the voice is rejected at startup, not at the first failing tool call.
- `destructive`: flips HITL gating on. The runtime pauses for operator approval before `execute` runs. No app-side wiring needed.
- `setup` / `teardown`: opens connections, logs in, warms caches at startup; closes them on shutdown. The runtime guarantees `teardown` runs.
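Here is one way a runtime might act on these fields. This is a hypothetical sketch, not Tutti's actual code: `loadVoice`, `callTool`, and the `approve` callback are invented for illustration, `loadVoice` merely stands in for whatever `PermissionGuard.check()` really does, and `Schema` is a dependency-free stand-in that mimics the result shape of Zod's `safeParse`.

```typescript
// Hypothetical sketch; none of these helpers are Tutti's real runtime API.
type Permission = string

// Stand-in for a Zod schema: same safeParse() result shape, no dependency.
interface Schema<T> {
  safeParse(input: unknown): { success: true; data: T } | { success: false; error: string }
}

interface Tool<T = unknown> {
  name: string
  parameters: Schema<T>
  execute(input: T): Promise<{ content: string }>
  destructive?: boolean
}

interface Voice {
  name: string
  tools: Tool[]
  required_permissions: Permission[]
}

// Startup check: reject the voice before any of its tools can be called.
function loadVoice(voice: Voice, granted: Permission[]): Voice {
  const missing = voice.required_permissions.filter((p) => !granted.includes(p))
  if (missing.length > 0) {
    throw new Error(`voice "${voice.name}" needs: ${missing.join(', ')}`)
  }
  return voice
}

// Per-call path: validate with the tool's own schema, then gate destructive tools.
async function callTool<T>(
  tool: Tool<T>,
  rawInput: unknown,
  approve: (toolName: string) => Promise<boolean>,
): Promise<{ content: string }> {
  const parsed = tool.parameters.safeParse(rawInput)
  if (!parsed.success) {
    throw new Error(`invalid input for ${tool.name}: ${parsed.error}`)
  }
  if (tool.destructive && !(await approve(tool.name))) {
    throw new Error(`operator rejected ${tool.name}`)
  }
  return tool.execute(parsed.data)
}

// A destructive tool with a hand-written schema (illustrative only; it does
// not touch the filesystem).
const deleteFile: Tool<{ path: string }> = {
  name: 'delete_file',
  destructive: true,
  parameters: {
    safeParse: (input) => {
      const path = (input as { path?: unknown } | null)?.path
      return typeof path === 'string'
        ? { success: true, data: { path } }
        : { success: false, error: 'path must be a string' }
    },
  },
  execute: async ({ path }) => ({ content: `deleted ${path}` }),
}

const fsVoice: Voice = {
  name: 'fs',
  tools: [deleteFile],
  required_permissions: ['filesystem'],
}
```

In this sketch a missing permission surfaces as a thrown error from `loadVoice`, before any tool runs. A bad input or a rejected approval surfaces from `callTool`, before `execute` is reached.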

What a voice looks like end-to-end

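```ts
import type { Voice, Tool } from '@tuttiai/types'
import { z } from 'zod'

const get_forecast: Tool = {
  name: 'get_forecast',
  description: 'Forecast for a city.',
  parameters: z.object({ city: z.string() }),
  execute: async ({ city }) => {
    // Encode the city so names with spaces or slashes still form a valid URL.
    const r = await fetch(`https://api.weather.com/${encodeURIComponent(city)}`)
    // Surface HTTP failures instead of returning an error page as the forecast.
    if (!r.ok) throw new Error(`Forecast request failed: ${r.status}`)
    return { content: await r.text() }
  },
}

export const WeatherVoice: Voice = {
  name: 'weather',
  required_permissions: ['network'],
  tools: [get_forecast],
}
```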

That's it. Publish to npm under `@yourorg/weather` and any Tutti score file can install it and grant it network access. No framework registration ceremony. No globals.

Why typed plugins beat string-based ones

Frameworks that treat tools as strings on a list can't give you autocomplete, can't catch a typo at compile time, and can't surface that two voices both want a `search` tool until they collide at runtime. Tutti voices are first-class TypeScript imports. Your IDE knows what they expose. The compiler tells you when an agent references a voice that doesn't exist.
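A small sketch of both halves of that argument, under stated assumptions: the `Voice` shape is a trimmed local stand-in, `DocsVoice` and `WikiVoice` are made-up voices, and `findToolCollisions` is a hypothetical helper showing where a load-time collision check could live. It is not a description of how Tutti detects collisions.

```typescript
// Hypothetical sketch: minimal voice shape, not the real @tuttiai/types.
interface Voice {
  name: string
  tools: { name: string }[]
}

const DocsVoice: Voice = { name: 'docs', tools: [{ name: 'search' }, { name: 'open_page' }] }
const WikiVoice: Voice = { name: 'wiki', tools: [{ name: 'search' }, { name: 'edit_page' }] }

// Voices are referenced as values, so a typo such as `DocsVoise` fails to
// compile ("Cannot find name"), whereas a misspelled string would only be
// noticed once the agent runs.
const agentVoices: Voice[] = [DocsVoice, WikiVoice]

// Load-time check: map each tool name to the voices declaring it, then keep
// only the names claimed by more than one voice.
function findToolCollisions(voices: Voice[]): Map<string, string[]> {
  const owners = new Map<string, string[]>()
  for (const voice of voices) {
    for (const tool of voice.tools) {
      owners.set(tool.name, [...(owners.get(tool.name) ?? []), voice.name])
    }
  }
  return new Map(Array.from(owners).filter(([, names]) => names.length > 1))
}

const collisions = findToolCollisions(agentVoices)
```

Because every voice declares its tools up front, a check like this can run once when the agent is assembled, so a clash is reported before any tool call is made.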

Why this matters for the ecosystem

Twelve voices ship today: filesystem, github, playwright, web, sandbox, rag, discord, slack, twitter, postgres, stripe, mcp. The MCP voice means anything with an MCP server is reachable in one install. And anyone — you, me, a contributor — can write a voice in twenty lines and publish it. The plugin ecosystem is the framework.
