Narration & Voice Play
Add voice narration and voice-driven play to your story. Readers can listen to scenes being read aloud with different voices for different characters, and select choices by speaking.
Getting Started
Open the Narration panel from the workshop sidebar. From here you can configure voices, generate TTS audio, record narration, and manage segments.
Configuring Voices
Voice configuration controls how generated text-to-speech (TTS) audio sounds. If you plan to record or upload your own audio instead, you can skip this section.
Narrator Voice
The narrator voice reads all scene text that isn’t marked as dialogue. Each language gets one narrator voice.
- In the Narration panel, select a language from the dropdown.
- Under Speed, adjust the speaking rate (0.8x to 1.2x).
- Under Voice, choose from the available voices.
- Click Preview to hear a sample.
- Click Save to apply.
Dialogue Voices
You can create named voices for character dialogue. When you mark text in the scene editor as being spoken by a character, that voice is used during narration.
- In the Narration panel, click Add Voice.
- Give it a name (e.g., “Elena”, “Guard Captain”).
- Choose a voice and speed.
- Each voice gets a color automatically, used for visual highlighting in the editor. You can change the color by clicking a swatch.
If a voice mark in the editor doesn’t match any dialogue voice name, the narrator voice is used as a fallback.
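The fallback rule can be pictured as a simple lookup. The following is a minimal sketch, not the workshop's actual implementation: `VoiceConfig`, `resolve_voice`, and the voice IDs are hypothetical names, and exact, case-sensitive name matching is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceConfig:
    """Hypothetical voice settings; field names are illustrative only."""
    voice_id: str
    speed: float = 1.0


def resolve_voice(mark_name, dialogue_voices, narrator):
    """Return the voice for a marked passage, falling back to the narrator."""
    if mark_name is None:
        return narrator  # unmarked text is read by the narrator
    # Assumption: names are matched exactly; unknown names fall back.
    return dialogue_voices.get(mark_name, narrator)


narrator = VoiceConfig("narrator-default")
dialogue_voices = {"Elena": VoiceConfig("voice-a", speed=1.1)}

print(resolve_voice("Elena", dialogue_voices, narrator).voice_id)          # voice-a
print(resolve_voice("Guard Captain", dialogue_voices, narrator).voice_id)  # narrator-default
```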
Marking Dialogue in the Editor
To indicate which parts of your scene text are spoken by an NPC or character:
- Select the text in the scene editor.
- In the bubble menu that appears, click the voice dropdown.
- Choose the character who is speaking.
- The text gets a colored underline matching that voice’s color.
To remove a voice assignment, select the marked text and click Remove Voice in the bubble menu. If that text had a narration segment generated for it, the segment is also deleted.
Expressive Speech
The TTS engine supports expressive tags that control how text is spoken. There are two types: delivery styles that change how a passage sounds, and narration effects that insert standalone audio like laughter or sighs.
Both types are purely for narration. Readers never see them during gameplay.
Delivery Styles
Delivery styles change how the TTS voice speaks a passage of text. For example, you can make a line whispered, shouted, or sung.
- Select the text you want to modify.
- In the bubble menu, click Emotion.
- Choose a style from the dropdown. Options are grouped by category:
  - Delivery Style: Whisper, Soft, Loud, Slow, Fast, Higher Pitch, Lower Pitch, Emphasis, Sing-Song, Singing, Laugh While Speaking
  - Intensity: Build Intensity, Decrease Intensity
- The text gets an amber underline to show the style is applied.
To change the style, click the marked text and use the Change button. To remove it, click the trash icon.
You can combine delivery styles with voice marks. For example, you could mark a villain’s dialogue with their voice and also set it to “Whisper”. Both are applied during generation.
Narration Effects
Narration effects insert a standalone sound at a specific point in the narration. Use them for things like a character laughing between sentences, a dramatic pause, or a sigh.
- Type @ in the scene editor to open the mention dropdown.
- Scroll to the Narration Effects section (below Sound Effects).
- Type to filter, then press Enter or click to insert.
Available effects: Laugh, Chuckle, Giggle, Cry, Sigh, Breath, Inhale, Exhale, Tsk, Tongue Click, Lip Smack, Hum, Pause, Long Pause.
The effect appears as a small amber pill in the editor. During generation, it becomes its own audio segment. To delete it, place your cursor next to it and press Backspace or Delete, or click it and use the delete button in the bubble menu.
Generating TTS Narration
Once voices are configured:
- In the Narration panel, click Narrate Current Scene or Narrate All Scenes.
- The system converts your scene text to speech using the configured voices, splitting it into segments by voice role.
- When generation completes, the panel automatically switches to the Play tab where segments appear as an ordered list.
Dialogue sections use the assigned voice, and everything else uses the narrator voice.
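To illustrate how text might be split by voice role, here is a rough sketch. It assumes voice marks are stored as non-overlapping character ranges with a voice name; that storage format and the function name `split_into_segments` are assumptions made for the example, not the product's documented data model.

```python
def split_into_segments(text, marks):
    """Split text into (role, passage) pairs.

    marks: list of (start, end, voice_name) ranges, assumed not to overlap.
    Text outside every mark is assigned to the narrator.
    """
    segments = []
    cursor = 0
    for begin, end, voice in sorted(marks):
        gap = text[cursor:begin].strip()
        if gap:
            segments.append(("narrator", gap))
        segments.append((voice, text[begin:end].strip()))
        cursor = end
    tail = text[cursor:].strip()
    if tail:
        segments.append(("narrator", tail))
    return segments


text = 'The gate creaked. "Halt!" The guard raised a hand.'
pos = text.index('"Halt!"')
marks = [(pos, pos + len('"Halt!"'), "Guard Captain")]

for role, passage in split_into_segments(text, marks):
    print(f"{role}: {passage}")
# narrator: The gate creaked.
# Guard Captain: "Halt!"
# narrator: The guard raised a hand.
```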
Staleness Detection
If you edit a scene’s text or change voice settings after generating narration, affected segments are marked as Stale with a warning badge. This means the audio no longer matches the current text.
To update a stale segment, click the refresh button (↻) that appears next to its play button. This regenerates just that segment with the updated text, keeping it in the same position. You can also regenerate the entire scene with Narrate Current Scene.
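One common way to detect this kind of mismatch is to store a fingerprint of the inputs alongside each segment and compare it later. The sketch below assumes that approach; the guide does not say how the workshop detects staleness, and `fingerprint`, `is_stale`, and the segment fields are hypothetical.

```python
import hashlib
import json


def fingerprint(text, voice_id, speed):
    """Hash the inputs that determine how a segment sounds."""
    payload = json.dumps({"text": text, "voice": voice_id, "speed": speed}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_stale(segment, current_text, voice_id, speed):
    """A segment is stale when its stored fingerprint no longer matches."""
    return segment["fingerprint"] != fingerprint(current_text, voice_id, speed)


segment = {"fingerprint": fingerprint("The gate creaked.", "voice-a", 1.0)}

print(is_stale(segment, "The gate creaked.", "voice-a", 1.0))       # False
print(is_stale(segment, "The gate creaked open.", "voice-a", 1.0))  # True (text edited)
print(is_stale(segment, "The gate creaked.", "voice-a", 1.1))       # True (speed changed)
```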
Multi-Language Generation
Select a different language from the dropdown to generate narration for translations. If the selected language doesn’t have voice configs set up, the primary language’s voice configs are used.
If the current scene hasn’t been translated for the selected language, a warning is shown and the Narrate Current Scene button is disabled. Translate the scene first in the Translations panel.
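The two rules in this section (voice config fallback and the translation check) could be expressed roughly as follows. This is an illustrative sketch only; the dictionary layout and the names `voice_configs_for` and `can_narrate` are assumptions.

```python
def voice_configs_for(language, configs_by_language, primary_language):
    """Use the language's own voice configs, else the primary language's."""
    configs = configs_by_language.get(language)
    return configs if configs else configs_by_language[primary_language]


def can_narrate(language, translated_languages, primary_language):
    """Narration is available only if the scene exists in that language."""
    return language == primary_language or language in translated_languages


configs_by_language = {"en": {"narrator": "voice-en-1"}}

print(voice_configs_for("fr", configs_by_language, "en"))  # {'narrator': 'voice-en-1'}
print(can_narrate("fr", {"de"}, "en"))                     # False
print(can_narrate("de", {"de"}, "en"))                     # True
```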
Recording Narration
To record narration with your microphone:
- Navigate to the scene you want to narrate.
- In the Narration panel, click Record.
- Speak into your microphone. Click Stop when finished.
- Preview the recording, then click Save.
Recorded narrations replace any existing narration for that scene and language.
Uploading Audio Files
You can upload pre-recorded audio:
- Navigate to the scene you want to narrate.
- In the Narration panel, click Upload or drag and drop a file.
- Select an audio file (MP3, WAV, OGG, WebM).
Per-Segment Narration
Instead of generating or recording audio for an entire scene at once, you can build up narration segment by segment. This lets you mix approaches: for example, you can use TTS for the narrator while recording real voice acting for dialogue.
Creating Segments from Selected Text
- Select a passage of text in the scene editor.
- In the bubble menu that appears, click Narrate.
- The Narration panel switches to the Record tab and shows a Narrate Selection card with a preview of your selected text.
- From here you can:
  - Click Generate TTS to create a TTS segment for just that text.
  - Use the Record or Upload controls below to record or upload audio for that segment instead.
- The new segment appears in the Play tab alongside any other segments for the scene.
You can repeat this for different passages: select the narrator prose and generate TTS, then select a character’s dialogue and record your own voice for it. Each becomes its own segment.
Reordering and Managing Segments
In the Play tab, segments are listed in the order they will play:
- Drag segments to reorder them.
- Expand a segment (click the row) to see the source text it was generated from.
- Delete individual segments with the × button.
This lets you build a scene’s narration from a mix of TTS, recorded, and uploaded segments, arranged in whatever order sounds right.
Playing Narration in the Workshop
The Play tab shows all segments for the current scene. You can:
- Click Play to play all segments in order, with background music ducked automatically.
- Play individual segments using the play button on each row.
- Pause, resume, or stop playback at any time.
- Drag segments to reorder them.
- Expand a segment to see its source text.
- Delete individual segments.
Overall playback and individual segment playback are mutually exclusive. Playing a segment pauses the overall narration, and starting overall narration stops any individual segment.
Background music pauses when narration is paused and resumes when narration resumes.
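The mutual exclusion between overall and individual playback can be modeled as a small state holder. This is a conceptual sketch under stated assumptions: the class `NarrationPlayer`, its state names, and the segment IDs are hypothetical, and background music handling is left out.

```python
class NarrationPlayer:
    """Tracks overall and single-segment playback so only one is active."""

    def __init__(self):
        self.overall = "stopped"    # "stopped", "playing", or "paused"
        self.active_segment = None  # id of the individually playing segment

    def play_all(self):
        # Starting overall narration stops any individual segment.
        self.active_segment = None
        self.overall = "playing"

    def play_segment(self, segment_id):
        # Playing a single segment pauses the overall narration.
        if self.overall == "playing":
            self.overall = "paused"
        self.active_segment = segment_id


player = NarrationPlayer()
player.play_all()
player.play_segment("seg-2")
print(player.overall, player.active_segment)  # paused seg-2
player.play_all()
print(player.overall, player.active_segment)  # playing None
```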
Conditional Segments
If your scene uses conditional text blocks, narration segments inherit those conditions. During playback, segments are filtered based on the reader’s current game state, so only relevant segments play.
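As a rough picture of this filtering, assume each segment optionally carries a condition expressed as required game-state values. That representation, and the names `condition_met` and `playable_segments`, are assumptions for illustration; the actual condition format is not described here.

```python
def condition_met(condition, game_state):
    """True when every required key has the expected value in the game state."""
    return all(game_state.get(key) == expected for key, expected in condition.items())


def playable_segments(segments, game_state):
    """Keep unconditional segments and those whose condition is satisfied."""
    return [s for s in segments if condition_met(s.get("condition", {}), game_state)]


segments = [
    {"id": "intro", "text": "You enter the hall."},
    {"id": "with-key", "text": "The key hums in your pocket.", "condition": {"has_key": True}},
    {"id": "no-key", "text": "Your pockets are empty.", "condition": {"has_key": False}},
]

print([s["id"] for s in playable_segments(segments, {"has_key": True})])
# ['intro', 'with-key']
```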
Reader Experience
Narration Playback
When narration is enabled and a reader reaches a scene with narration audio:
- The narration plays automatically.
- Background music volume drops while narration is playing.
- When narration finishes, background music returns to normal volume.
Readers can control narration in their game settings:
- Enable/Disable narration audio.
- Volume slider for narration.
- Show Text toggle to display scene text alongside audio.
- Auto-open Microphone toggle to automatically activate the microphone after narration finishes.
Voice Input
Readers can select choices by speaking:
- When choices are displayed, a microphone button appears.
- Tap the microphone and speak the choice you want. Any active narration playback is paused automatically.
- The system transcribes the speech and matches it to the closest choice.
- If the match is confident enough, the choice is selected automatically.
Voice input uses fuzzy matching, so readers don’t need to say the exact choice text. The system handles common speech variations like contractions (“don’t” vs “do not”), hyphenated words, and minor punctuation differences.
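To make the idea concrete, here is a minimal sketch of fuzzy choice matching using Python's standard `difflib`. It is not the system's actual matcher: the contraction table, the `normalize` and `match_choice` helpers, and the 0.75 confidence threshold are all assumptions chosen for the example.

```python
import difflib
import re

# Small illustrative table; a real matcher would likely cover more forms.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "won't": "will not", "i'm": "i am"}


def normalize(text):
    """Lowercase, split hyphenated words, drop punctuation, expand contractions."""
    text = text.lower().replace("\u2019", "'").replace("-", " ")
    text = re.sub(r"[^\w\s']", "", text)  # keep apostrophes until contractions are expanded
    words = [CONTRACTIONS.get(word, word) for word in text.split()]
    return " ".join(words).replace("'", "")


def match_choice(transcript, choices, threshold=0.75):
    """Return the closest choice, or None if no match is confident enough."""
    spoken = normalize(transcript)
    best, best_score = None, 0.0
    for choice in choices:
        score = difflib.SequenceMatcher(None, spoken, normalize(choice)).ratio()
        if score > best_score:
            best, best_score = choice, score
    return best if best_score >= threshold else None


choices = ["Don't open the door", "Open the well-worn door"]
print(match_choice("do not open the door", choices))     # Don't open the door
print(match_choice("open the well worn door", choices))  # Open the well-worn door
print(match_choice("tell me a joke", choices))           # None
```

Normalizing both the transcript and the choice text before comparing is what lets "do not" match "don't" exactly, while the threshold keeps unrelated speech from selecting a choice.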