Does it require an ElevenLabs API key?

Yes. You need an active ElevenLabs account with API access. The skill manages the API calls and handles audio chunking for longer files.

What's the maximum audio length it can handle?

There's no hard limit from the skill side. Long recordings are automatically split into chunks for processing and reassembled into a complete transcript. Processing time scales linearly with audio length.
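The chunking described above can be pictured with a little arithmetic. The sketch below is illustrative only: the `chunk_ranges` helper and the 600-second chunk length are assumptions, not the skill's documented behavior or its actual chunk size.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "start end" (in seconds) for each chunk of a recording.
# The default chunk length of 600 s is an assumed value chosen for illustration.
chunk_ranges() {
  local total="$1" size="${2:-600}" start=0 end
  while [ "$start" -lt "$total" ]; do
    end=$(( start + size ))
    # The last chunk is shorter when the duration is not a multiple of the size.
    [ "$end" -gt "$total" ] && end="$total"
    echo "$start $end"
    start="$end"
  done
}

# A 25-minute (1500 s) recording yields three ranges under these assumptions.
chunk_ranges 1500 600
```

Because each range is processed independently, total processing time grows roughly in proportion to the number of chunks, which matches the linear scaling mentioned above.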


ElevenLabs Speech-to-Text

clawdbotborges · Jan 26, 2026

Media & Creative


Summary

TL;DR: Transcribes audio files to text using ElevenLabs' speech recognition, with support for speaker detection and multiple languages.

ElevenLabs Speech-to-Text is an OpenClaw skill that transcribes audio files using ElevenLabs Speech-to-Text (Scribe v2).

Created by clawdbotborges, this skill has been downloaded 3k+ times on ClawHub. Install it with one command and your AI agent gains these capabilities right away.

Use cases

Transcribe meeting recordings and extract action items from the text
Convert podcast episodes to text for show notes, blog posts, or search indexing
Process customer support calls into searchable transcripts for quality review
Create subtitles or captions for video content from the audio track

Installation

Run this command to install the skill on your OpenClaw agent:


npx clawhub@latest install elevenlabs-stt

Downloads: 3.5k

Updated: Jan 26, 2026

Security scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

This is a straightforward ElevenLabs transcription skill, but users should understand that selected audio files are sent to ElevenLabs for processing.


SKILL.md

---
name: elevenlabs-stt
description: Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
homepage: https://elevenlabs.io/speech-to-text
metadata: {"clawdbot":{"emoji":"🎙️","requires":{"bins":["curl"],"env":["ELEVENLABS_API_KEY"]},"primaryEnv":"ELEVENLABS_API_KEY"}}
---

# ElevenLabs Speech-to-Text

Transcribe audio files using ElevenLabs' Scribe v2 model. Supports 90+ languages with speaker diarization.

## Quick Start

```bash
# Basic transcription
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3

# With speaker diarization
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --diarize

# Specify language (improves accuracy)
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --lang en

# Full JSON output with timestamps
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --json
```

## Options

| Flag | Description |
|------|-------------|
| `--diarize` | Identify different speakers |
| `--lang CODE` | ISO language code (e.g., en, pt, es) |
| `--json` | Output full JSON with word timestamps |
| `--events` | Tag audio events (laughter, music, etc.) |

## Supported Formats

All major audio/video formats: mp3, m4a, wav, ogg, webm, mp4, etc.
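As a rough pre-check before uploading a file, the extension can be compared against the formats named above. This is a minimal sketch: `is_listed_format` is a hypothetical helper, and because the list ends in "etc.", a failed check is only a hint, not proof that the file is unsupported.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check: succeed if the file extension is one of the formats
# explicitly named in this section (case-insensitive).
is_listed_format() {
  local ext="${1##*.}"
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    mp3|m4a|wav|ogg|webm|mp4) return 0 ;;
    *) return 1 ;;
  esac
}

is_listed_format "voice_note.OGG" && echo "listed format"
```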

## API Key

Set the `ELEVENLABS_API_KEY` environment variable, or configure the key in `clawdbot.json`:

```json5
{
  skills: {
    entries: {
      "elevenlabs-stt": {
        apiKey: "sk_..."
      }
    }
  }
}
```
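The skill metadata declares two requirements: the `ELEVENLABS_API_KEY` variable and the `curl` binary. A small preflight check along the following lines can catch a missing requirement before a transcription is attempted. `check_stt_prereqs` is a hypothetical helper, not part of the skill, and it only covers the environment-variable route, not a key stored in `clawdbot.json`.

```bash
#!/usr/bin/env bash
# Hypothetical preflight: verify the requirements declared in the skill metadata.
# The binary to look for defaults to curl; the optional argument is an assumption
# added so the check can be reused for other tools.
check_stt_prereqs() {
  local bin="${1:-curl}"
  if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
    echo "ELEVENLABS_API_KEY is not set" >&2
    return 1
  fi
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "$bin not found on PATH" >&2
    return 1
  fi
  return 0
}

check_stt_prereqs || echo "fix the issues above before transcribing" >&2
```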

## Examples

```bash
# Transcribe a WhatsApp voice note
{baseDir}/scripts/transcribe.sh ~/Downloads/voice_note.ogg

# Meeting recording with multiple speakers
{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize --lang en

# Get JSON for processing
{baseDir}/scripts/transcribe.sh podcast.mp3 --json > transcript.json
```
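To transcribe a whole folder, the single-file calls above can be wrapped in a loop. The following is a sketch under stated assumptions: `transcribe_dir` and the `TRANSCRIBE` variable are hypothetical names, `TRANSCRIBE` must point at the installed `transcribe.sh`, and the script is assumed to write the plain-text transcript to standard output when `--json` is not given.

```bash
#!/usr/bin/env bash
# Hypothetical batch wrapper around the skill's transcribe.sh.
# TRANSCRIBE holds the command to run; {baseDir} resolves differently per install,
# so the caller is expected to set it.
transcribe_dir() {
  local in_dir="$1" out_dir="$2" f name
  mkdir -p "$out_dir"
  for f in "$in_dir"/*.mp3; do
    [ -e "$f" ] || continue            # skip when the glob matches nothing
    name="$(basename "${f%.*}")"
    # One transcript per recording, named after the source file.
    "$TRANSCRIBE" "$f" --diarize > "$out_dir/$name.txt"
  done
}

# Usage (paths are placeholders):
#   TRANSCRIBE=/path/to/scripts/transcribe.sh
#   transcribe_dir ./recordings ./transcripts
```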

Version history

v1.0.0 (Latest), Jan 26, 2026

- Initial release of elevenlabs-stt skill.
- Transcribe audio files using ElevenLabs Scribe v2 model.
- Supports 90+ languages and speaker diarization.
- Multiple output options: plain text, JSON with timestamps, audio event tagging.
- Works with all major audio and video formats.
- Requires ELEVENLABS_API_KEY for authentication.

Frequently asked questions

How accurate is the transcription?

Accuracy depends on audio quality, accents, and background noise. Clear recordings generally transcribe with high accuracy. The skill handles accented English and multiple languages well, though very noisy environments reduce quality.


Skill info

Version: v1.0.0

Author: clawdbotborges

Category: Media & Creative

Updated: Jan 26, 2026

Files

SKILL.md (1.7 KB)


Skill data sourced from ClawHub