How to Tune Stability and Similarity for Better-Sounding Voiceovers
Adjust the voice setting sliders to balance consistency against expressiveness and fix robotic or unstable output.
Two settings shape almost everything about how an ElevenLabs voice sounds: Stability and Similarity. Getting them right is the difference between a flat, robotic read and a believable performance. This guide explains what each one does and how to dial them in.
What you need
- Any ElevenLabs voice open in Text to Speech or Studio
- A short test sentence with some emotion in it
- A few minutes to compare generations
Step 1: Open the voice settings
Below the voice picker, expand the settings panel. You will see Stability and Similarity sliders and, on some models, Style and Speaker boost controls as well.
Step 2: Understand stability
Stability controls how much the delivery varies from one generation to the next. Low stability lets the voice be more emotional and varied, but it can occasionally glitch or wander. High stability is steady and repeatable, but it can sound monotone. For narration, a mid-to-high setting usually works; for lively characters, go lower.
Step 3: Understand similarity
Similarity controls how closely the output sticks to the original voice's character. Higher values track the source voice tightly, which is usually what you want for a clone. Pushing it too high can amplify artefacts from a noisy sample.
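If you prefer to keep your slider choices in a script rather than in your head, the guidance in Steps 2 and 3 can be written down as named starting points. The sketch below is only an illustration: the 0.0 to 1.0 range, the field name `similarity_boost` (an API-style name assumed here for the Similarity slider), the `VoiceSettings` class, and every number in `PRESETS` are assumptions, not official recommendations.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VoiceSettings:
    """Slider values on an assumed 0.0 to 1.0 scale."""
    stability: float
    similarity_boost: float  # assumed name for the Similarity slider

    def __post_init__(self):
        # Reject values outside the assumed slider range.
        for name, value in asdict(self).items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")


# Illustrative starting points only; tune them by ear for your own voice.
PRESETS = {
    # Narration: mid-to-high stability for a steady, repeatable read.
    "narration": VoiceSettings(stability=0.65, similarity_boost=0.75),
    # Lively character: lower stability for more emotional variation.
    "lively_character": VoiceSettings(stability=0.35, similarity_boost=0.75),
}


def settings_for(use_case: str) -> dict:
    """Return the preset for a use case as a plain dict."""
    return asdict(PRESETS[use_case])


print(settings_for("narration"))
```

Treat these presets as a place to begin listening, not as final values. If a cloned voice was made from a noisy sample, try lowering the similarity value first.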
Step 4: A/B test the same line
Generate your test sentence, change one slider, and generate again. Changing only one variable at a time tells you which slider caused the difference. Keep the test sentence identical so the comparison is fair.
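The one-variable-at-a-time rule is easy to enforce if you plan the comparison before generating anything. The following is a minimal sketch, assuming settings are kept as a plain dict; the helper `ab_variants` and the key names `stability` and `similarity_boost` are hypothetical names chosen for illustration. It only builds the list of settings to try and does not call any ElevenLabs service.

```python
def ab_variants(baseline: dict, slider: str, values: list[float]) -> list[dict]:
    """Return copies of the baseline that differ only in one slider."""
    if slider not in baseline:
        raise KeyError(f"unknown slider: {slider}")
    variants = []
    for value in values:
        variant = dict(baseline)  # copy, so the baseline is never modified
        variant[slider] = value   # change exactly one setting
        variants.append(variant)
    return variants


# Hypothetical baseline; every variant keeps similarity_boost fixed.
baseline = {"stability": 0.5, "similarity_boost": 0.75}
for settings in ab_variants(baseline, "stability", [0.3, 0.5, 0.7]):
    print(settings)
```

Generate the same test sentence once per printed row and label each clip with its settings, so you can trace any difference you hear back to the single value that changed.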
Result: a voice that holds its character across a long script without drifting into either a robotic monotone or random glitches.