How to Create Audiobooks with AI

Step-by-Step Guide (2026)

From manuscript to published audiobook in 1-3 days for about $30

  • ~$30: full 80K-word novel (Studio plan)
  • 1-3 days: total production time
  • 10-20 sales: break-even, vs. 320-800 with a human narrator

Why Now Is the Best Time to Release an Audiobook

  • $20B+: audiobook industry by 2026, the fastest-growing format in publishing
  • 45%+: share of digital book consumption; readers want audio versions of every title
  • 85-90%: royalties via direct sales, vs. 25% on ACX/Audible

ACX (Audible) and Findaway now explicitly accept AI-narrated audiobooks with disclosure. Most indie authors' backlist titles have no audio edition, which is a significant untapped revenue opportunity.

What You'll Need Before You Start

Your Manuscript

DOCX, PDF, TXT, or EPUB — paste or upload directly.

  • Spell out numbers ("twenty-five" not "25")
  • Add phonetic spellings for unusual names
  • Remove header/footer artifacts
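The manuscript cleanup above can be partly automated. The following is a minimal Python sketch, not a SpeechGeneration AI feature: it spells out standalone whole numbers from 0 to 999 and swaps in phonetic respellings. The function names and the sample respelling are illustrative assumptions. Larger numbers, decimals, and ordinals such as "25th" are left untouched for manual review.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

# Hypothetical respelling table: fill in your own character names.
PHONETIC = {"Siobhan": "Shiv-awn"}

# A standalone run of 1-3 digits that is not part of "1,000", "3.5", or "25th".
NUMBER = re.compile(r"(?<![\d.,])\b\d{1,3}\b(?![.,]\d)")


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 (the limit of this sketch)."""
    if not 0 <= n <= 999:
        raise ValueError("only 0-999 is supported")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def prepare_text(text: str, phonetic: dict = PHONETIC) -> str:
    """Spell out small numbers, then apply whole-word phonetic respellings."""
    text = NUMBER.sub(lambda m: number_to_words(int(m.group())), text)
    for name, respelling in phonetic.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, lambda m, r=respelling: r, text)
    return text


print(prepare_text("Siobhan was 25 and paid 1,000 for 3.5 acres."))
```

Anything the sketch skips, such as years, prices, and decimals, still needs a manual pass, because the right spoken form depends on context.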

A SpeechGeneration AI Account

  • Free: 10,000 chars (~1 chapter)
  • Starter $5/mo: 100K chars (~novella)
  • Studio $30/mo: 450K chars (~full novel)

Commercial rights included on all plans.

Audacity (Free)

Only needed for ACX submission. Use it to normalize loudness (LUFS), add room tone, and export platform-compliant MP3 files. It is free and runs on Windows, macOS, and Linux.

How to Create Your Audiobook: 5 Steps

1

Choose Your Book Type & Voice Tier

Match your genre to the right quality tier. This determines cost, emotional expressiveness, and language availability.

Book Type | Best Tier | Est. Cost
Self-published fiction | Studio+ (2×) | ~$40-60
Non-fiction / self-help | Studio (1×) | ~$30
Children's books | Studio+ (2×) | ~$5-15
Backlist / catalog scaling | Economy (0.1×) | ~$10
Literary fiction with dialogue | Studio+ (2×) | ~$40-60

2

Upload & Cast Your Narrator

Paste your text or upload your manuscript. Preview voices and select one that matches your genre.

Thrillers / Fantasy

Deep male — authoritative, commanding

Romance / Contemporary

Warm female — expressive, intimate

Non-fiction / Self-help

Neutral — clear, professional pacing

For dialogue-heavy fiction, set up a multi-voice project to assign distinct voices to each character.

3

Add Emotion Tags (Studio+ Only)

Place bracket tags before key lines to direct emotional delivery. The same words delivered with a different emotion produce a completely different scene.

Tag | Effect
[whisper] | Intimate, secretive
[excited] | Energetic, joyful
[sad] | Melancholic, pensive
[angry] | Intense, confrontational
[calm] | Soothing, measured
[serious] | Authoritative, grave
[laugh] | Natural laughter
[pause] | Dramatic beat

Pro tips: Use 1-2 tags per character turn. Tags can be combined: [excited] [whisper]. Don't over-tag; leave some lines untagged for natural variation.
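A quick way to catch tagging mistakes before spending characters on generation is to lint the script. The Python sketch below is an illustration only. It assumes each line of the script is one character turn, and the function name and the per-line limit are hypothetical choices based on the tag table and tips above. It flags tags that are not in the table and lines carrying more than two tags.

```python
import re

# Tags listed in the table above.
KNOWN_TAGS = {"whisper", "excited", "sad", "angry",
              "calm", "serious", "laugh", "pause"}
MAX_TAGS_PER_LINE = 2  # "1-2 tags per character turn"
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def check_tags(script: str) -> list[str]:
    """Return human-readable problems, treating each line as one turn."""
    problems = []
    for number, line in enumerate(script.splitlines(), start=1):
        tags = TAG_PATTERN.findall(line)
        for tag in tags:
            if tag not in KNOWN_TAGS:
                problems.append(f"line {number}: unknown tag [{tag}]")
        if len(tags) > MAX_TAGS_PER_LINE:
            problems.append(
                f"line {number}: {len(tags)} tags, more than {MAX_TAGS_PER_LINE}"
            )
    return problems


sample = (
    '[excited] [whisper] "We found it!"\n'
    '[shout] "Run!"\n'
    '[sad] [calm] [pause] "It is over."'
)
for problem in check_tags(sample):
    print(problem)
```

Here the second line is reported for the unknown tag [shout] and the third for carrying three tags.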
4

Generate & Quality Check

Generation time estimates

  • 10,000 chars → ~30 seconds
  • 50,000 chars → ~2-3 minutes
  • 100,000 chars → ~5-10 minutes

Within each chapter, generate in chunks of 500-1,000 characters for consistent output.
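Splitting a chapter by hand is tedious, so here is a minimal Python sketch of one way to do it. It is an assumption about workflow, not part of the product: it breaks text between sentences so that no chunk exceeds a character limit (1,000 by default, the upper bound suggested above). The function name chunk_text is hypothetical.

```python
import re


def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking between sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than the limit is hard-split as a fallback.
        while len(sentence) > max_chars:
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks


chapter = "One two three. Four five six! Seven eight nine? Ten."
for chunk in chunk_text(chapter, max_chars=30):
    print(len(chunk), chunk)
```

Breaking at sentence boundaries keeps each chunk self-contained, which should help avoid mid-sentence changes in pacing where chunks are joined.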

QA checklist

  • Pronunciation of character names
  • Pacing matches story tone
  • Emotional delivery fits scenes
  • Audio levels consistent
  • No robotic artifacts

5

Export & Distribute

Export each chapter as MP3 at 192 kbps, normalize loudness in Audacity to the target platform's level, and upload with an AI narration disclosure.

Platform | Format | Loudness | Royalty (author/platform)
ACX / Audible | MP3 192 kbps | -23 LUFS | 25/75 split
Findaway / INaudio | MP3 / WAV | -16 LUFS | 80/20
Apple Books | MP3 | -16 LUFS | 70/30
Spotify for Authors | MP3 | Variable | Variable
BookFunnel (direct) | MP3 / WAV | Variable | 85-90%

Maximum reach: ACX + Findaway combined. Highest royalties: BookFunnel direct sales (85-90%).
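The loudness targets in the table translate into a simple gain adjustment: the target level minus the measured level, in dB. The Python sketch below shows that arithmetic using the table's figures. The measured value is assumed to come from a loudness meter, the function name is hypothetical, and a real mastering pass should still be checked by ear and against each platform's current requirements.

```python
# Loudness targets taken from the table above (platforms listed as
# "Variable" are omitted because no fixed target is given).
TARGET_LUFS = {
    "ACX / Audible": -23.0,
    "Findaway / INaudio": -16.0,
    "Apple Books": -16.0,
}


def gain_to_target(measured_lufs: float, platform: str) -> tuple[float, float]:
    """Return (gain in dB, linear amplitude factor) to reach the target."""
    gain_db = TARGET_LUFS[platform] - measured_lufs
    factor = 10 ** (gain_db / 20)  # dB to amplitude ratio
    return gain_db, factor


for platform in TARGET_LUFS:
    gain_db, factor = gain_to_target(-19.0, platform)
    print(f"{platform}: {gain_db:+.1f} dB (x{factor:.3f})")
```

A positive gain means the file must be made louder, which can push peaks into clipping, so check peak levels after applying it.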

Choosing Your Quality Tier

Three tiers, three use cases. Pick the one that matches your book type and budget.

Economy (0.1×)

~$10 / 80K novel

  • Use case: Backlist, NPCs, functional narration
  • Languages: 15 languages
  • Emotion: No emotion tags
  • Best for: Non-fiction, bulk narration, drafts

Studio (1×) (Recommended)

~$30 / 80K novel

  • Use case: Most books — recommended default
  • Languages: 30+ languages
  • Emotion: No emotion tags
  • Best for: Non-fiction, self-help, contemporary fiction

Studio+ (2×)

~$40-60 / 80K novel

  • Use case: Fiction with dialogue, children's books
  • Languages: 70+ languages
  • Emotion: 8+ emotion tags
  • Best for: Literary fiction, character-driven narratives
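To see how the per-novel figures above could arise, here is a rough Python estimate. It rests on assumptions the page implies but does not state: that the tier multiplier scales how many plan characters a job consumes, that a novel averages about 5.5 characters per word (the midpoint of 400,000-480,000 characters for 80,000 words), and that cost is counted in whole months of a plan. The names are hypothetical and real billing may differ.

```python
import math

# Multipliers and plan allowances as quoted on this page.
TIER_MULTIPLIER = {"Economy": 0.1, "Studio": 1.0, "Studio+": 2.0}
PLANS = {"Starter": (5, 100_000), "Studio": (30, 450_000)}  # (USD/month, chars)


def estimate(words: int, tier: str, plan: str, chars_per_word: float = 5.5) -> dict:
    """Estimate characters, plan characters consumed, and whole-month cost."""
    chars = words * chars_per_word
    consumed = chars * TIER_MULTIPLIER[tier]  # assumed billing rule
    price, allowance = PLANS[plan]
    months = math.ceil(consumed / allowance)
    return {
        "characters": round(chars),
        "consumed": round(consumed),
        "months": months,
        "cost_usd": months * price,
    }


print(estimate(80_000, "Studio", "Studio"))
print(estimate(80_000, "Studio+", "Studio"))
```

Under these assumptions an 80,000-word novel fits in one month of the Studio plan at the Studio tier and needs two months at Studio+, which matches the upper end of the ~$40-60 range quoted above.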

Your First Audiobook Production Checklist

Before Generation

  • Manuscript formatted and proofread
  • Character names spelled phonetically if unusual
  • Numbers spelled out (e.g. 'twenty-five' not '25')
  • Voice tier selected based on book type
  • Previewed at least 3 narrator voices
  • Multi-voice project set up (if dialogue-heavy fiction)
  • Emotion tags added to key lines (Studio+ only)

During Generation

  • Generated first chapter as a test
  • Listened for pronunciation issues
  • Checked emotional delivery (Studio+)
  • Verified pacing matches story tone
  • Adjusted text and re-generated if needed

Post-Generation & Distribution

  • Listened to full audiobook at normal speed
  • Noted and fixed any mispronunciations
  • Exported MP3 at 192 kbps
  • Normalized LUFS in Audacity for platform
  • Added room tone (0.5s) at start/end of each chapter file
  • Prepared metadata and AI narration disclosure

AI Audiobook Creation FAQ

Can I publish an AI-narrated audiobook on ACX/Audible?

Yes. ACX accepts AI narration with disclosure. Upload MP3 at 192 kbps, -23 LUFS, add room tone, and disclose AI narration in the product description.

How long does it take to create an AI audiobook?

About 1-3 days for an 80,000-word novel: roughly 2-3 hours of generation plus QA time, compared with 2-6 weeks with a human narrator.

Which quality tier should I use for fiction?

Studio+ for fiction with significant dialogue. The emotion tags ([excited], [whisper], [serious]) transform flat delivery into expressive character narration.

What is the cheapest way to produce an audiobook?

Economy tier on the Starter plan ($5/mo) costs ~$10 for an 80,000-word novel. Best for backlist titles or non-fiction where clarity matters more than nuance.

Do I need any other software?

Just Audacity (free) to normalize LUFS before ACX upload. Everything else (generation, export, MP3 conversion) is handled within SG.ai.

Can I use different voices for different characters?

Yes. See the multi-voice TTS guide for assigning distinct voices to each character in your audiobook.

How do I disclose AI narration?

A simple line in your product description: 'This audiobook was narrated using AI text-to-speech technology.' ACX and Findaway both require this disclosure.

How many characters are in an 80,000-word novel?

An 80,000-word novel is approximately 400,000-480,000 characters. The Studio plan ($30/mo, ~450K chars) covers one full novel per month.

Create Your Audiobook Free

10,000 characters free, about one chapter. No credit card required.

Commercial use included · 70+ languages