New — MCP server, CLI, and SDKs in one launch

AudioPod for Agents

Generate music, narrate audiobooks, clone voices, transcribe meetings, separate stems — from any AI agent, IDE, or terminal. One MCP endpoint. Two SDKs. One CLI. Auth that any modern agent already understands.

Connect your agent

Use from CLI

Building over HTTP instead? Explore the full developer API

$ pip install audiopod && audiopod music "lo-fi rainy 90 BPM"

Send your AI agent to AudioPod

View skill.md

Works with Claude, GPT, Codex, Cursor, Continue, Cline, OpenClaw, Hermes, and any agent that can read a URL. Paste this into your agent and it onboards itself.

One-shot agent prompt

Read https://audiopod.ai/skill.md and follow the instructions to onboard yourself to AudioPod. After you finish, do whatever I ask next using AudioPod's tools.

1. Copy the line above.
2. Paste it into your AI agent's chat.
3. The agent fetches /skill.md and walks you through getting an API key + wiring up MCP, CLI, or SDK on its own.

Plug into any agent

AudioPod speaks Model Context Protocol over Streamable HTTP. Pick the one-line install for your CLI, or paste a snippet into a desktop client.

Recommended

One-line install

The claude mcp add command ships with Claude Code 2.x and adds the server to your user-level config.

claude mcp add --transport http audiopod https://mcp.audiopod.ai \
  --header "X-API-Key: ap_YOUR_KEY" --scope user

Or paste a snippet into a desktop client

~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "audiopod": {
      "url": "https://mcp.audiopod.ai",
      "headers": {
        "X-API-Key": "ap_YOUR_KEY"
      }
    }
  }
}
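If the desktop client already has other servers configured, pasting the snippet by hand risks overwriting them. A minimal sketch of merging the entry programmatically, assuming the file uses the "mcpServers" layout shown above; the helper names add_audiopod_server and update_config_file are illustrative and not part of any AudioPod tooling:

```python
import json
from pathlib import Path


def add_audiopod_server(config: dict, api_key: str) -> dict:
    """Return a copy of a client config with an "audiopod" MCP entry added.

    Other entries under "mcpServers" and any unrelated top-level keys are kept.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["audiopod"] = {
        "url": "https://mcp.audiopod.ai",
        "headers": {"X-API-Key": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Merge the entry into the JSON file at `path`, creating it if missing."""
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_audiopod_server(current, api_key)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling update_config_file with the path shown above and your own key would leave existing servers in place and add or refresh only the audiopod entry.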

Need an API key? Create one in your dashboard. Free tier credits unlock the full tool surface — no card required.
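For a client that talks to the endpoint directly, each MCP message is a JSON-RPC 2.0 object sent by HTTP POST. The sketch below only builds such a request and does not send it. It reuses the X-API-Key header from the snippets above; the Accept header listing both JSON and event-stream follows the MCP Streamable HTTP transport. The helper name build_mcp_request is illustrative, and a real session would normally begin with an initialize call before tools/list.

```python
import json
import urllib.request
from typing import Optional

MCP_URL = "https://mcp.audiopod.ai"


def build_mcp_request(
    method: str,
    api_key: str,
    params: Optional[dict] = None,
    request_id: int = 1,
) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 POST for the MCP endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
            "X-API-Key": api_key,
        },
    )
```

Passing the result to urllib.request.urlopen would send it; the response format and any session header the server returns are not covered here.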

Audio capabilities, one endpoint

Each skill is documented at /.well-known/agent-skills/ and can be invoked over MCP, REST, the SDKs, or the CLI.

generate_music

Generate full songs, instrumentals, vocal stems, or rap tracks from a text prompt. Royalty-free.

Read skill

text_to_speech

100+ languages, 100+ voices, custom cloned voices. Stream or batch.

Catalog

clone_voice

5–30s reference clip → reusable voice with watermark + consent attestation.

Read skill
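Since the reference clip has to fall in the 5 to 30 second range, it can be worth checking the length locally before uploading. A minimal sketch using Python's standard wave module, assuming an uncompressed WAV file and assuming both bounds are inclusive (the page does not say how the service treats the edges):

```python
import wave

MIN_SECONDS = 5.0   # lower bound stated for clone_voice reference clips
MAX_SECONDS = 30.0  # upper bound stated for clone_voice reference clips


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def is_valid_reference_clip(path: str) -> bool:
    """True if the clip length is within the 5 to 30 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Other formats such as MP3 or M4A would need a different decoder; this check covers WAV only.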

separate_stems

Vocals, drums, bass, guitar, piano, other — up to 16 stems. Perfect for remixes and karaoke.

Read skill

transcribe_audio

Word-level timestamps, speaker diarization, SRT/VTT/JSON output.

Read skill
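To illustrate how word-level timestamps relate to subtitle output, here is a small sketch that groups words into SRT cues. The input shape (a list of dicts with "word", "start", and "end" in seconds) is an assumption made for the example, not AudioPod's documented JSON schema, and the fixed words-per-cue grouping is a simplification.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list, max_words: int = 7) -> str:
    """Group timestamped words into numbered SRT cues of up to `max_words`."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])  # cue starts with its first word
        end = srt_timestamp(chunk[-1]["end"])     # and ends with its last word
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

A production subtitle writer would also split on pauses and line length; the point here is only the mapping from per-word times to cue ranges.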

denoise_audio

Remove hiss, hum, traffic, room tone — preserves voice character.

Try it

convert_media

Convert between MP3, WAV, FLAC, OGG, M4A, MP4, MOV. Adjust quality and bitrate.

Try it

narrate_audiobook

Convert a manuscript (PDF/EPUB/DOCX) into an ACX-compliant audiobook with chapter splits, a retail sample, a free first regeneration for each line, and multi-platform export (Audible, Spotify, Google Play, Kindle Vella).

Read skill

From the terminal

One CLI, ships with both SDKs. Same commands whether you pip install audiopod or npm i -g audiopod.

pip install audiopod

Auth: audiopod login stores your API key at ~/.audiopod/config.json (or the CLI reads AUDIOPOD_API_KEY from the environment).

Output: human-readable by default; pass --json for scripting.

Async jobs: the CLI streams progress and writes the final file when complete.

# Authenticate once
audiopod login

# Generate music from a prompt
audiopod music "lo-fi rainy 90 BPM" --duration 60 --out song.wav

# Voiceover — pick any slug from the public voice catalog
audiopod tts "Welcome to AudioPod." --voice aurora-warm-friendly --out hello.wav

# Transcribe a meeting with speaker labels
audiopod transcribe meeting.mp3 --diarize --format srt > meeting.srt

# Split a song into stems
audiopod stems track.wav --mode six

# Clone a voice from a 30-second sample
audiopod clone reference.wav --name "Narrator"

# Poll any async job
audiopod jobs job_abc123
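The two key sources mentioned in the auth note (the AUDIOPOD_API_KEY environment variable and ~/.audiopod/config.json) could be resolved in a script roughly as follows. This is a sketch under two assumptions the page does not confirm: that the environment variable takes precedence, and that the config file stores the key under a field named "api_key" (a hypothetical name).

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CONFIG = Path.home() / ".audiopod" / "config.json"


def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    config_path: Path = DEFAULT_CONFIG,
) -> Optional[str]:
    """Return the API key from the environment, else from the config file."""
    key = env.get("AUDIOPOD_API_KEY")
    if key:
        return key  # assumed precedence: environment first
    try:
        data = json.loads(config_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    if not isinstance(data, dict):
        return None
    return data.get("api_key")  # hypothetical field name
```

Returning None instead of raising lets a caller decide whether to prompt for audiopod login or fail.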

Two SDKs, identical surface

The Python and Node clients mirror each other call-for-call. Same shapes, same async ergonomics, same automatic credit reservations.

Python: pip install audiopod

from audiopod import AudioPod

client = AudioPod()  # reads AUDIOPOD_API_KEY

# Generate a song
job = client.music.generate(
    prompt="lo-fi rainy 90 BPM",
    duration=60,
)
song = job.wait()
song.download("song.wav")

# Synthesize speech — pick any slug from the public voice catalog
speech = client.tts.synthesize(
    text="Welcome to AudioPod.",
    voice="aurora-warm-friendly",
)
speech.download("hello.wav")
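For readers wondering what a call like job.wait() involves, the general pattern is polling with a growing delay until the job reaches a terminal state. The sketch below shows that pattern in isolation. It is not the SDK's implementation: the fetch_status callable, the "state" field, and the state names "completed" and "failed" are all hypothetical stand-ins.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval: float = 1.0,
    max_interval: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll `fetch_status` with exponential backoff until the job finishes."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status["state"] in ("completed", "failed"):  # hypothetical states
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)
        # Double the delay each round, capped at max_interval.
        interval = min(interval * 2, max_interval)
```

The sleep and clock parameters exist so the loop can be exercised in tests without real waiting.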

TypeScript: npm install audiopod

import AudioPod from "audiopod";

const client = new AudioPod(); // reads AUDIOPOD_API_KEY

// Generate a song
const job = await client.music.generate({
  prompt: "lo-fi rainy 90 BPM",
  duration: 60,
});
const song = await job.wait();
await song.download("song.wav");

// Synthesize speech — pick any slug from the public voice catalog
const speech = await client.tts.synthesize({
  text: "Welcome to AudioPod.",
  voice: "aurora-warm-friendly",
});
await speech.download("hello.wav");

Open standards everywhere

Every discovery surface an agent might check is published. No bespoke handshakes, no closed schemas.

skill.md (one-shot agent onboarding): moltbook-style
MCP Server Card: modelcontextprotocol.io
OAuth Protected Resource: RFC 9728
OAuth Authorization Server: RFC 8414
OpenID Discovery: OpenID Connect Discovery 1.0
A2A Agent Card: a2a-protocol.org
Agent Skills: agentskills.io
API Catalog: RFC 9727
ai.txt: Crawler metadata
llms.txt / llms-full.txt: llmstxt.org
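Several of these standards define fixed well-known paths, so an agent can derive the discovery URLs from an origin alone. A sketch of that lookup follows. The OAuth, OpenID, and API Catalog paths come from the cited specifications, and /skill.md and /.well-known/agent-skills/ come from this page; which host actually serves each document is an assumption, and the MCP Server Card, A2A Agent Card, and ai.txt locations are left out because their paths are not stated here.

```python
from urllib.parse import urljoin

DISCOVERY_PATHS = {
    "oauth_protected_resource": "/.well-known/oauth-protected-resource",      # RFC 9728
    "oauth_authorization_server": "/.well-known/oauth-authorization-server",  # RFC 8414
    "openid_configuration": "/.well-known/openid-configuration",  # OIDC Discovery 1.0
    "api_catalog": "/.well-known/api-catalog",                    # RFC 9727
    "llms_txt": "/llms.txt",                                      # llmstxt.org
    "skill_md": "/skill.md",                                      # named on this page
    "agent_skills": "/.well-known/agent-skills/",                 # named on this page
}


def discovery_urls(origin: str) -> dict:
    """Map each discovery surface to its absolute URL under `origin`."""
    return {name: urljoin(origin, path) for name, path in DISCOVERY_PATHS.items()}
```

For example, discovery_urls("https://audiopod.ai")["skill_md"] yields the same https://audiopod.ai/skill.md URL used in the one-shot agent prompt.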

Ship your first agent-driven audio in 60 seconds

Free credits. No card. No vendor lock-in. Open standards from the first request.

Get an API key

Read the docs

