Sag

Peter Steinberger·Jan 4, 2026

Summary

TL;DR: ElevenLabs text-to-speech with mac-style say UX.

Sag turns text into speech using ElevenLabs' voice synthesis API. It works like the macOS say command but with high-quality AI voices instead of the built-in system voices.

Just pass text to your agent and it speaks it out loud. The interface is simple on purpose: type text in, hear audio out. No complicated settings to configure for basic use.

This is handy when you want your agent to read something aloud, whether that is a summary, a notification, or just a long block of text you would rather listen to than read. For speech-to-text, check the ElevenLabs STT skill.

Use cases

  • Having your AI agent read long text summaries aloud while you multitask
  • Generating spoken audio from written content for accessibility
  • Creating quick voice memos or audio snippets from text
  • Adding text-to-speech output to AI agent workflows

Installation

Run this command to install the skill on your OpenClaw agent:

npx clawhub@latest install sag
Downloads: 18.4k
Active installs: 964
Stars: 19
Updated: Jan 4, 2026

Security scan

VirusTotal: Benign
OpenClaw: Benign (medium confidence)

This is a coherent ElevenLabs text-to-speech skill, but it requires trusting a Homebrew-installed CLI and using an ElevenLabs API key.

SKILL.md

---
name: sag
description: ElevenLabs text-to-speech with mac-style say UX.
homepage: https://sag.sh
metadata: {"clawdbot":{"emoji":"🗣️","requires":{"bins":["sag"],"env":["ELEVENLABS_API_KEY"]},"primaryEnv":"ELEVENLABS_API_KEY","install":[{"id":"brew","kind":"brew","formula":"steipete/tap/sag","bins":["sag"],"label":"Install sag (brew)"}]}}
---

# sag

Use `sag` for ElevenLabs TTS with local playback.

API key (required)
- `ELEVENLABS_API_KEY` (preferred)
- `SAG_API_KEY` also supported by the CLI
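
A minimal sketch of a pre-flight check a wrapper script might run before calling `sag`. The function name `resolve_sag_key` is hypothetical and not part of the CLI. The precedence shown (`ELEVENLABS_API_KEY` first) only mirrors the "preferred" note above; it is an assumption, not a statement about how `sag` itself resolves keys.

```bash
# Hypothetical helper: print the first API key that is set, or fail.
# Assumption: ELEVENLABS_API_KEY wins because the notes above call it preferred.
resolve_sag_key() {
  if [ -n "${ELEVENLABS_API_KEY:-}" ]; then
    printf '%s\n' "$ELEVENLABS_API_KEY"
  elif [ -n "${SAG_API_KEY:-}" ]; then
    printf '%s\n' "$SAG_API_KEY"
  else
    echo "Set ELEVENLABS_API_KEY (or SAG_API_KEY) before running sag." >&2
    return 1
  fi
}
```

Calling `resolve_sag_key >/dev/null || exit 1` at the top of a script stops early with a readable message instead of failing partway through a request.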

Quick start
- `sag "Hello there"`
- `sag speak -v "Roger" "Hello"`
- `sag voices`
- `sag prompting` (model-specific tips)

Model notes
- Default: `eleven_v3` (expressive)
- Stable: `eleven_multilingual_v2`
- Fast: `eleven_flash_v2_5`

Pronunciation + delivery rules
- First fix: respell (e.g. "key-note"), add hyphens, adjust casing.
- Numbers/units/URLs: `--normalize auto` (or `off` if it harms names).
- Language bias: `--lang en|de|fr|...` to guide normalization.
- v3: SSML `<break>` not supported; use `[pause]`, `[short pause]`, `[long pause]`.
- v2/v2.5: SSML `<break time="1.5s" />` supported; `<phoneme>` not exposed in `sag`.
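
The pause rules above differ by model, so a script that switches models may want to pick the markup in one place. The following is a sketch under that assumption; `pause_markup` is a hypothetical helper name, and the model IDs are the three listed under "Model notes".

```bash
# Hypothetical helper: print pause markup suited to a model ID.
pause_markup() {
  case "$1" in
    eleven_v3)
      # v3 does not support SSML <break>; use a bracketed tag instead.
      printf '%s\n' '[pause]'
      ;;
    eleven_multilingual_v2 | eleven_flash_v2_5)
      # v2/v2.5 accept SSML breaks.
      printf '%s\n' '<break time="1.5s" />'
      ;;
    *)
      echo "unknown model: $1" >&2
      return 1
      ;;
  esac
}
```

With the default model this could be used as `sag "First point. $(pause_markup eleven_v3) Second point."`.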

v3 audio tags (put at the start of a line)
- `[whispers]`, `[shouts]`, `[sings]`
- `[laughs]`, `[starts laughing]`, `[sighs]`, `[exhales]`
- `[sarcastic]`, `[curious]`, `[excited]`, `[crying]`, `[mischievously]`
- Example: `sag "[whispers] keep this quiet. [short pause] ok?"`
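
Because tags go at the start of a line, a small helper can keep the placement consistent when text is assembled in a script. This is only an illustrative sketch; `tag_line` is a hypothetical name and it does not validate that the tag is one of those listed above.

```bash
# Hypothetical helper: prefix one line of text with a v3 audio tag.
tag_line() {
  local tag="$1"
  shift
  # The tag is wrapped in brackets and placed before the text.
  printf '[%s] %s\n' "$tag" "$*"
}
```

For example, `sag "$(tag_line whispers 'keep this quiet.')"` passes `[whispers] keep this quiet.` to the CLI.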

Voice defaults
- `ELEVENLABS_VOICE_ID` or `SAG_VOICE_ID`

Confirm voice + speaker before long output.

## Chat voice responses

When Peter asks for a "voice" reply (e.g., "crazy scientist voice", "explain in voice"), generate audio and send it:

```bash
# Generate audio file
sag -v Clawd -o /tmp/voice-reply.mp3 "Your message here"

# Then include in reply:
# MEDIA:/tmp/voice-reply.mp3
```

Voice character tips:
- Crazy scientist: Use `[excited]` tags, dramatic pauses `[short pause]`, vary intensity
- Calm: Use `[whispers]` or slower pacing
- Dramatic: Use `[sings]` or `[shouts]` sparingly
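
The generate-then-attach steps can be wrapped in one function so the `MEDIA:` line is only produced when audio generation succeeds. This is a sketch, not part of the skill: `voice_reply` is a hypothetical name, and it uses only the `-v` and `-o` flags shown in the example above.

```bash
# Hypothetical wrapper around the documented command:
#   sag -v <voice> -o <file> "<text>"
# Prints the MEDIA: line to include in the reply once the file is written.
voice_reply() {
  local voice="$1" out="$2"
  shift 2
  # Stop without printing MEDIA: if sag reports an error.
  sag -v "$voice" -o "$out" "$*" || return 1
  printf 'MEDIA:%s\n' "$out"
}
```

A crazy-scientist reply might then be produced with `voice_reply Clawd /tmp/voice-reply.mp3 "[excited] It works! [short pause] It actually works!"`.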

Default voice for Clawd: `lj2rcrvANS3gaWWnczSX` (or just `-v Clawd`)

Version history

v1.0.0 (Latest)
Jan 4, 2026

Frequently asked questions

An ElevenLabs account and API key are required. ElevenLabs offers a free tier with a limited number of characters per month, which is enough for casual use.

Skill info

Version: v1.0.0
Author: Peter Steinberger
Updated: Jan 4, 2026

Files

SKILL.md (2.0 KB)

Skill data sourced from ClawHub