How to Create Audiobooks with AI
Step-by-Step Guide (2026)
From manuscript to published audiobook in 1-3 days for about $30
- ~$30: full 80K-word novel (Studio plan)
- 1-3 days: total production time
- 10-20 sales: break-even, vs. 320-800 with a human narrator
Why Now Is the Best Time to Release an Audiobook
- $20B+: audiobook industry by 2026, the fastest-growing format in publishing
- 45%+: share of digital book consumption; readers want audio versions of every title
- 85-90%: royalties via direct sales, vs. 25% on ACX/Audible
ACX (Audible) and Findaway now explicitly accept AI-narrated audiobooks with disclosure. Most indie authors' backlist titles have no audio edition, which is a significant untapped revenue opportunity.
What You'll Need Before You Start
Your Manuscript
DOCX, PDF, TXT, or EPUB — paste or upload directly.
- Spell out numbers ("twenty-five" not "25")
- Add phonetic spellings for unusual names
- Remove header/footer artifacts
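The manuscript prep steps above can be partly automated. The following is a minimal sketch, not a feature of any particular tool: the function names are hypothetical, it only spells out standalone integers below 1,000, and it treats lines consisting solely of a page number (optionally prefixed with "Page") as header/footer artifacts. Prices, decimals, years, and anything else ambiguous are left for manual review.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

# Assumption: a line holding only a number, or "Page N", is a page artifact.
ARTIFACT_LINE = re.compile(r"^\s*(?:page\s+)?\d+\s*$", re.IGNORECASE)
# Standalone 1-3 digit integers; skips prices, decimals, and longer numbers.
SMALL_INT = re.compile(r"(?<![\w.,$])\d{1,3}(?![\w]|[.,]\d)")


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0-999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def clean_manuscript(text: str) -> str:
    """Drop page-number lines, then spell out small standalone integers."""
    kept = [line for line in text.splitlines()
            if not ARTIFACT_LINE.match(line)]
    cleaned = "\n".join(kept)
    return SMALL_INT.sub(lambda m: number_to_words(int(m.group())), cleaned)


if __name__ == "__main__":
    print(clean_manuscript("She was 25.\n12\nPage 3\nHe paid $25 in 2026."))
```

The output is lowercase, so a number that opens a sentence still needs capitalizing by hand, and a chapter heading that is just a numeral would be removed along with the page numbers.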
A SpeechGeneration AI Account
- Free: 10,000 chars (~1 chapter)
- Starter $5/mo: 100K chars (~novella)
- Studio $30/mo: 450K chars (~full novel)
Commercial rights included on all plans.
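To check which plan a manuscript fits before signing up, a rough character estimate is enough. The sketch below uses the plan quotas listed above and assumes about 5.5 characters per word, the midpoint of the 400,000-480,000 character range this guide gives for an 80,000-word novel. The function names are hypothetical, and it assumes the quotas are counted at the standard Studio rate.

```python
from typing import Optional

# (plan name, monthly price in USD, character quota) as listed in this guide.
PLANS = [
    ("Free", 0, 10_000),
    ("Starter", 5, 100_000),
    ("Studio", 30, 450_000),
]


def estimate_chars(word_count: int, chars_per_word: float = 5.5) -> int:
    """Rough character count, spaces included (assumed ratio)."""
    return round(word_count * chars_per_word)


def smallest_plan(char_count: int) -> Optional[str]:
    """Return the cheapest plan whose quota covers char_count, else None."""
    for name, _price, quota in PLANS:
        if char_count <= quota:
            return name
    return None  # Longer than one month of the largest listed plan.


if __name__ == "__main__":
    chars = estimate_chars(80_000)
    print(chars, smallest_plan(chars))
```

For a real manuscript, `len(text)` on the cleaned file gives the exact count and is preferable to the word-based estimate.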
Audacity (Free)
Only needed for ACX submission. Normalizes LUFS, adds room tone, and exports platform-compliant MP3 files. Free and available on all operating systems.
How to Create Your Audiobook: 5 Steps
Step 1: Choose Your Book Type & Voice Tier
Match your genre to the right quality tier. This determines cost, emotional expressiveness, and language availability.
| Book Type | Best Tier | Est. Cost |
|---|---|---|
| Self-published fiction | Studio+ (2×) | ~$40-60 |
| Non-fiction / self-help | Studio (1×) | ~$30 |
| Children's books | Studio+ (2×) | ~$5-15 |
| Backlist / catalog scaling | Economy (0.1×) | ~$10 |
| Literary fiction with dialogue | Studio+ (2×) | ~$40-60 |
Step 2: Upload & Cast Your Narrator
Paste your text or upload your manuscript. Preview voices and select one that matches your genre.
- Thrillers / Fantasy: deep male voice, authoritative and commanding
- Romance / Contemporary: warm female voice, expressive and intimate
- Non-fiction / Self-help: neutral voice with clear, professional pacing
For dialogue-heavy fiction, set up a multi-voice project to assign distinct voices to each character.
Step 3: Add Emotion Tags (Studio+ Only)
Place bracket tags before key lines to direct emotional delivery. The same words read with a different emotion produce a completely different scene.
| Tag | Effect |
|---|---|
| [whisper] | Intimate, secretive |
| [excited] | Energetic, joyful |
| [sad] | Melancholic, pensive |
| [angry] | Intense, confrontational |
| [calm] | Soothing, measured |
| [serious] | Authoritative, grave |
| [laugh] | Natural laughter |
| [pause] | Dramatic beat |
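A mistyped tag is easy to miss in a long manuscript, so it can help to lint the text before generating. This sketch checks bracket tags against the eight tags in the table above and can strip them for tiers without emotion support. The helper names are hypothetical, and the tag list is only what this guide shows; the service may accept more.

```python
import re

# Tags taken from the table in this guide; the real list may be longer.
KNOWN_TAGS = {"whisper", "excited", "sad", "angry",
              "calm", "serious", "laugh", "pause"}

ANY_TAG = re.compile(r"\[([a-z]+)\]")
KNOWN_TAG = re.compile(r"\[(?:" + "|".join(sorted(KNOWN_TAGS)) + r")\]\s*")


def find_unknown_tags(text: str) -> list:
    """Return bracketed lowercase words that are not known emotion tags."""
    return [m.group(1) for m in ANY_TAG.finditer(text)
            if m.group(1) not in KNOWN_TAGS]


def strip_known_tags(text: str) -> str:
    """Remove known tags, e.g. before generating on a tier without them."""
    return KNOWN_TAG.sub("", text)


if __name__ == "__main__":
    line = "[whisper] Don't move. [shout] Run!"
    print(find_unknown_tags(line))
    print(strip_known_tags("[whisper] Don't move. [pause] Now."))
```

Editorial brackets such as `[sic]` are reported as unknown rather than removed, so they can be reviewed by hand.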
Step 4: Generate & Quality Check
Generation time estimates
- 10,000 chars → ~30 seconds
- 50,000 chars → ~2-3 minutes
- 100,000 chars → ~5-10 minutes
Within each chapter, generate the text in chunks of 500-1,000 characters to keep delivery consistent.
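Splitting a chapter by hand is tedious, so here is one way to produce chunks in the recommended size range. It is a sketch under simple assumptions: sentences end with `.`, `!`, or `?` followed by whitespace, chunks are filled greedily up to 1,000 characters, and a single sentence longer than the limit is kept whole rather than cut mid-sentence. The function name is hypothetical.

```python
import re

SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def chunk_chapter(text: str, max_chars: int = 1000) -> list:
    """Split text into chunks of at most max_chars on sentence boundaries."""
    sentences = SENTENCE_END.split(text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = sentence if not current else current + " " + sentence
        if len(candidate) > max_chars and current:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    chapter = "This is a sentence. " * 100
    for i, chunk in enumerate(chunk_chapter(chapter), start=1):
        print(i, len(chunk))
```

Greedy filling keeps most chunks near the upper bound; only the last chunk of a chapter may fall below 500 characters.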
QA checklist
- Pronunciation of character names
- Pacing matches story tone
- Emotional delivery fits scenes
- Audio levels consistent
- No robotic artifacts
Step 5: Export & Distribute
Export MP3 at 192 kbps. Normalize in Audacity. Upload with AI narration disclosure.
| Platform | Format | Loudness | Royalty |
|---|---|---|---|
| ACX / Audible | MP3 192 kbps | -23 LUFS | 25/75 split |
| Findaway / INaudio | MP3 / WAV | -16 LUFS | 80/20 |
| Apple Books | MP3 | -16 LUFS | 70/30 |
| Spotify for Authors | MP3 | Variable | Variable |
| BookFunnel (direct) | MP3 / WAV | Variable | 85-90% |
Maximum reach: ACX + Findaway combined. Highest royalties: BookFunnel direct sales (85-90%).
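The royalty rate determines how many sales it takes to recover production costs. The sketch below works that out for the two rates this guide states outright (25% on ACX, 85-90% on direct sales). The $9.99 list price is a hypothetical figure chosen for illustration, not a platform requirement, and the function name is likewise made up.

```python
import math


def break_even_sales(production_cost: float, list_price: float,
                     royalty_rate: float) -> int:
    """Number of sales needed for royalties to cover production cost."""
    earnings_per_sale = list_price * royalty_rate
    if earnings_per_sale <= 0:
        raise ValueError("list_price and royalty_rate must be positive")
    return math.ceil(production_cost / earnings_per_sale)


if __name__ == "__main__":
    cost = 30.0    # Studio plan, one 80K-word novel (from this guide)
    price = 9.99   # Hypothetical list price
    print("ACX at 25%:", break_even_sales(cost, price, 0.25))
    print("Direct at 85%:", break_even_sales(cost, price, 0.85))
```

Swapping in a human-narration budget for `production_cost` gives the comparison figure for a traditionally produced audiobook.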
Choosing Your Quality Tier
Three tiers, three use cases. Pick the one that matches your book type and budget.
Economy (0.1×)
~$10 / 80K novel
- Use case: Backlist, NPCs, functional narration
- Languages: 15 languages
- Emotion: No emotion tags
- Best for: Non-fiction, bulk narration, drafts
Studio (1×)
~$30 / 80K novel
- Use case: Most books — recommended default
- Languages: 30+ languages
- Emotion: No emotion tags
- Best for: Non-fiction, self-help, contemporary fiction
Studio+ (2×)
~$40-60 / 80K novel
- Use case: Fiction with dialogue, children's books
- Languages: 70+ languages
- Emotion: 8+ emotion tags
- Best for: Literary fiction, character-driven narratives
Your First Audiobook Production Checklist
Before Generation
- Manuscript formatted and proofread
- Character names spelled phonetically if unusual
- Numbers spelled out (e.g. 'twenty-five' not '25')
- Voice tier selected based on book type
- Previewed at least 3 narrator voices
- Multi-voice project set up (if dialogue-heavy fiction)
- Emotion tags added to key lines (Studio+ only)
During Generation
- Generated first chapter as a test
- Listened for pronunciation issues
- Checked emotional delivery (Studio+)
- Verified pacing matches story tone
- Adjusted text and re-generated if needed
Post-Generation & Distribution
- Listened to full audiobook at normal speed
- Noted and fixed any mispronunciations
- Exported MP3 at 192 kbps
- Normalized LUFS in Audacity for platform
- Added room tone (0.5s) at start/end of each chapter file
- Prepared metadata and AI narration disclosure
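The 0.5-second padding step in the checklist is normally done in Audacity. For batches of chapter files it can also be scripted; the sketch below uses only Python's standard `wave` module and assumes 16-bit PCM WAV input. Note the stand-in: it inserts digital silence, whereas true room tone is low-level background noise, so treat this as a placeholder to check against each platform's requirements. The function name is hypothetical.

```python
import io
import wave


def pad_with_silence(wav_bytes: bytes, seconds: float = 0.5) -> bytes:
    """Return a WAV with `seconds` of silence added at the start and end.

    Assumes 16-bit PCM, where zero bytes are silence (8-bit WAV is
    unsigned and would need 0x80 instead).
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    pad_frames = int(params.framerate * seconds)
    silence = b"\x00" * (pad_frames * params.nchannels * params.sampwidth)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)
        # The frame count in the header is corrected when the writer closes.
        dst.writeframes(silence + frames + silence)
    return out.getvalue()


if __name__ == "__main__":
    # Build one second of 16-bit mono audio at 44.1 kHz as a stand-in chapter.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(44100)
        w.writeframes(b"\x01\x00" * 44100)
    padded = pad_with_silence(buf.getvalue())
    with wave.open(io.BytesIO(padded), "rb") as r:
        print(r.getnframes() / r.getframerate(), "seconds")
```

Working on WAV and converting to MP3 as the last step avoids re-encoding the audio twice.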
AI Audiobook Creation FAQ
Can I publish an AI-narrated audiobook on ACX/Audible?
Yes. ACX accepts AI narration with disclosure. Upload MP3 at 192 kbps, -23 LUFS, add room tone, and disclose AI narration in the product description.
How long does it take to produce an audiobook?
1-3 days for an 80,000-word novel: ~2-3 hours of generation plus QA time. Compare that to 2-6 weeks with a human narrator.
Which tier is best for fiction?
Studio+ for fiction with significant dialogue. The emotion tags ([excited], [whisper], [serious]) transform flat delivery into expressive character narration.
What is the cheapest way to produce an audiobook?
Economy tier on the Starter plan ($5/mo) costs ~$10 for an 80,000-word novel. Best for backlist titles or non-fiction where clarity matters more than nuance.
What other software do I need?
Just Audacity (free) to normalize LUFS before ACX upload. Everything else (generation, export, MP3 conversion) is handled within SG.ai.
Can I use different voices for different characters?
Yes. See the multi-voice TTS guide for assigning distinct voices to each character in your audiobook.
How do I disclose AI narration?
A simple line in your product description: 'This audiobook was narrated using AI text-to-speech technology.' ACX and Findaway both require this disclosure.
How many characters is a full novel?
An 80,000-word novel is approximately 400,000-480,000 characters. The Studio plan ($30/mo, ~450K chars) covers one full novel per month.
Create Your Audiobook Free
10,000 characters free — about one chapter. No credit card required.
Commercial use included · 70+ languages