WhatsApp Voice
@tuttiai/whatsapp — send and receive WhatsApp messages via Meta's official Cloud API
The WhatsApp voice gives agents the ability to send WhatsApp messages via Meta’s official Cloud API, and powers the inbound webhook for @tuttiai/inbox’s WhatsApp adapter.
Two tools ship, both destructive: true:
- send_text_message: free-form text. Only valid within the 24h customer-service window.
- send_template_message: pre-approved Message Templates. Required for re-engagement outside the 24h window.
Why Cloud API and not whatsapp-web.js?
This voice uses Meta’s official Cloud API (Graph API v21+). It does NOT use whatsapp-web.js or any unofficial WhatsApp Web automation:
- ToS — automating WhatsApp Web violates WhatsApp’s terms; accounts can be banned.
- Stability — unofficial libraries break on every WhatsApp update; the Cloud API is contracted.
- Auth — Cloud API uses System User access tokens; no QR codes, no phone-pairing.
- Reliability — webhooks are delivered with retries; web-automation pipelines silently drop on disconnect.
Installation
npx tutti-ai add whatsapp
Required permissions
permissions: ["network"]
Required environment variables
| Var | Description |
|---|---|
| WHATSAPP_ACCESS_TOKEN | Permanent System User access token. Generate in Meta Business → System Users with whatsapp_business_messaging + whatsapp_business_management scopes. |
| WHATSAPP_VERIFY_TOKEN | Any random string (e.g. openssl rand -hex 32). Configure both this env var AND the matching value in Meta App → WhatsApp → Configuration → Verify token. |
| WHATSAPP_APP_SECRET | Meta App → Settings → Basic → App Secret. Used to verify HMAC-SHA256 signatures on every inbound webhook. |
The phoneNumberId is NOT a secret — it’s an opaque identifier visible in the Meta App dashboard. It stays in the score.
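A startup check for the three variables above might look like the following. This is a minimal sketch: `missingWhatsAppEnv` is a hypothetical helper written for illustration, not something the package is known to export.

```typescript
// Hypothetical helper (not part of @tuttiai/whatsapp): report which of the
// required WhatsApp variables are unset or empty.
const REQUIRED_VARS = [
  "WHATSAPP_ACCESS_TOKEN",
  "WHATSAPP_VERIFY_TOKEN",
  "WHATSAPP_APP_SECRET",
] as const;

function missingWhatsAppEnv(env: Record<string, string | undefined>): string[] {
  // An empty string is treated the same as an unset variable.
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Typical use at process start:
//   const missing = missingWhatsAppEnv(process.env);
//   if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```

Failing fast here is cheaper than discovering a missing secret on the first inbound webhook.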
Score example
import { WhatsAppVoice } from "@tuttiai/whatsapp";
import { defineScore } from "@tuttiai/core";
export default defineScore({
  agents: {
    support: {
      name: "support",
      system_prompt: "You are a WhatsApp support agent.",
      voices: [new WhatsAppVoice({ phoneNumberId: "1234567890" })],
      permissions: ["network"],
    },
  },
});
The webhook tunnel — main UX wart
The Cloud API requires Meta to POST inbound messages to a public HTTPS endpoint. The voice spins up a Fastify server on port 3848 (configurable) hosting GET /webhook (verify) and POST /webhook (inbound). You need to expose that port to the internet:
# Cloudflare Tunnel — recommended for production
cloudflared tunnel --url http://localhost:3848
# ngrok — fine for dev
ngrok http 3848
# Or run a proper reverse proxy (nginx, Caddy) in front of port 3848
Then in Meta App → WhatsApp → Configuration:
- Callback URL: https://<your-tunnel>/webhook
- Verify token: the value you set in WHATSAPP_VERIFY_TOKEN
- Subscribe to the messages webhook field (delivery-status events are silently ignored).
Send a test WhatsApp message from your personal phone to the configured business number — it should arrive in your agent within seconds.
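For context, the GET /webhook verify step follows Meta's standard webhook handshake: Meta sends `hub.mode`, `hub.verify_token`, and `hub.challenge` as query parameters and expects the challenge echoed back when the token matches. The sketch below shows that logic in isolation; `handleVerify` and its types are hypothetical names, and the voice's actual Fastify route may be structured differently.

```typescript
// Hypothetical, framework-free sketch of the webhook verify handshake.
interface VerifyQuery {
  "hub.mode"?: string;
  "hub.verify_token"?: string;
  "hub.challenge"?: string;
}

interface VerifyResult {
  status: number;
  body: string;
}

function handleVerify(query: VerifyQuery, expectedToken: string): VerifyResult {
  const mode = query["hub.mode"];
  const token = query["hub.verify_token"];
  const challenge = query["hub.challenge"];

  // Echo the challenge only for a subscribe request carrying the right token.
  if (mode === "subscribe" && token === expectedToken && challenge !== undefined) {
    return { status: 200, body: challenge };
  }
  return { status: 403, body: "Forbidden" };
}
```

If verification fails in the Meta dashboard, the usual cause is a mismatch between WHATSAPP_VERIFY_TOKEN and the value entered under Verify token.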
The 24-hour customer-service window
WhatsApp’s most surprising rule: outside of 24 hours since the user’s last inbound message, you can only send pre-approved Message Templates, not free-form text. The Cloud API rejects free-form messages outside this window with error 131047.
send_text_message surfaces 131047 with a clear hint pointing at send_template_message. Templates have to be registered + approved in Meta App → WhatsApp → Message Templates before they can be sent.
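The window rule and the error mapping can be sketched as follows. The function names and the hint wording are assumptions made for illustration; only the error code 131047, the 24-hour window, and the Graph API's `{ error: { code, message } }` response shape come from the platform itself.

```typescript
// Hypothetical sketch; the real tool's hint text and internals may differ.
const REENGAGEMENT_ERROR_CODE = 131047;
const SERVICE_WINDOW_MS = 24 * 60 * 60 * 1000;

// True while free-form text is still allowed for this conversation.
function isWithinServiceWindow(lastInboundAt: Date, now: Date): boolean {
  return now.getTime() - lastInboundAt.getTime() < SERVICE_WINDOW_MS;
}

// Minimal view of a Graph API error response body.
interface GraphErrorBody {
  error?: { code?: number; message?: string };
}

// Turn an API error into a message the agent can act on.
function describeSendError(body: GraphErrorBody): string {
  const code = body.error?.code;
  if (code === REENGAGEMENT_ERROR_CODE) {
    return "Outside the 24h customer-service window (131047): use send_template_message instead.";
  }
  return `WhatsApp API error ${code ?? "unknown"}: ${body.error?.message ?? "no message"}`;
}
```

A local window check is only an optimization: the Cloud API remains the source of truth, so the 131047 path still needs handling.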
Limitations in v0.25
- Group chats: not supported by the Cloud API for two-way bots; Meta does not deliver group messages over webhooks. Direct messages only.
- Outbound media: text only in v0.25. Inbound media (image / audio / video / document) is surfaced as [image] / [image] caption / etc. with the resolved Cloud API URL on the inbox message’s raw object. Adding a typed attachments field to the canonical InboxMessage shape is a deliberate cross-platform design pass deferred to v0.26.
- Polling fallback: there isn’t one. Webhooks are mandatory; the tunnel requirement is real.
Inbound (inbox)
The WhatsAppClientWrapper exposes subscribeMessage(handler) so @tuttiai/inbox’s WhatsApp adapter consumes the same Fastify server (one webhook listener regardless of whether the voice + adapter are both active). Sharing happens via WhatsAppClientWrapper.forKey(phoneNumberId, …) — keyed by the bot identity.
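The keyed-sharing idea can be illustrated with a small registry. This is a simplified sketch of the pattern only: `SharedWrapper`, its `dispatch` method, and the single-argument `forKey` are hypothetical stand-ins, not the real WhatsAppClientWrapper signatures.

```typescript
// Hypothetical sketch of a per-key shared instance with subscribers.
type MessageHandler = (text: string) => void;

class SharedWrapper {
  private static readonly instances = new Map<string, SharedWrapper>();
  private readonly handlers = new Set<MessageHandler>();
  readonly phoneNumberId: string;

  private constructor(phoneNumberId: string) {
    this.phoneNumberId = phoneNumberId;
  }

  // Return the existing instance for this bot identity, or create one.
  static forKey(phoneNumberId: string): SharedWrapper {
    let wrapper = SharedWrapper.instances.get(phoneNumberId);
    if (!wrapper) {
      wrapper = new SharedWrapper(phoneNumberId);
      SharedWrapper.instances.set(phoneNumberId, wrapper);
    }
    return wrapper;
  }

  // Register a handler; the returned function unsubscribes it.
  subscribeMessage(handler: MessageHandler): () => void {
    this.handlers.add(handler);
    return () => {
      this.handlers.delete(handler);
    };
  }

  // Fan an inbound message out to every subscriber.
  dispatch(text: string): void {
    this.handlers.forEach((handler) => handler(text));
  }
}
```

Because both the voice and the adapter resolve the same key, they end up on one instance and therefore one listener.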
Defence-in-depth on every POST:
| Surface | Default |
|---|---|
| HMAC-SHA256 signature check | Mandatory. Rejected (401) without the correct X-Hub-Signature-256 header. Constant-time comparison (crypto.timingSafeEqual) — no length-leak timing attack. |
| 200 ack BEFORE handler dispatch | Mandatory. Meta retries non-2xx within ~20s. |
| Inbound text redaction (SecretsManager.redact) | On by default. Opt out via redactRawText: false. |
| Body size limit | 5 MB. Configurable via bodyLimit. |
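The signature check in the first row can be sketched with Node's built-in crypto module. Meta sends the header as `sha256=<hex digest>`, computed over the raw request body with the App Secret as the key. `isValidSignature` is a hypothetical name, and this is an illustration of the technique rather than the package's own code.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical sketch: verify X-Hub-Signature-256 against the raw body.
function isValidSignature(
  rawBody: Buffer,
  signatureHeader: string | undefined,
  appSecret: string,
): boolean {
  const prefix = "sha256=";
  if (!signatureHeader || !signatureHeader.startsWith(prefix)) return false;

  const expected = createHmac("sha256", appSecret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader.slice(prefix.length), "hex");

  // timingSafeEqual throws on unequal lengths. The digest length is fixed and
  // public (32 bytes), so rejecting on length reveals nothing about the secret.
  if (received.length !== expected.length) return false;

  return timingSafeEqual(received, expected);
}
```

The check must run on the exact bytes received: re-serializing parsed JSON can change whitespace or key order and invalidate the digest.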
Lifecycle
The wrapper builds the Fastify instance immediately on construction (so tests can use app.inject(...) without binding to a port). The actual listen() call happens in launch() — called by the inbox adapter’s start(). destroy() closes the Fastify server and clears subscribers.