ElevenLabs
70+ Languages + Inline Tags by ElevenLabs

ElevenLabs v3

ElevenLabs's latest TTS — 70+ languages, 20 named voices (Aria, Roger, Sarah, ... Bill), inline audio tags ([laughs], [whispers], [excited]), and word-level timestamps for subtitling. Broadcast-quality output at $0.10 per 1,000 characters.

Model Specs

Released
Jun 2025
Voices
20
Languages
70+
Max characters
5K
Modalities
text → audio

About this model

ElevenLabs v3 (Eleven v3) is ElevenLabs's latest text-to-speech model and the most feature-complete TTS option on Renas AI. Three capabilities make it distinctive: (1) **70+ languages** including English, Mandarin, Hindi, Arabic, Spanish, French, German, Japanese, Korean, and dozens more — the broadest language coverage in any Renas TTS model; (2) **inline audio tags** like `[laughs]`, `[whispers]`, `[excited]`, `[sad]` for fine-grained emotional direction within the script; (3) **word-level timestamps** for subtitling and lip-sync workflows where you need to know exactly when each word is spoken.

The model offers 20 named voices: Aria, Roger, Sarah, Laura, Charlie, George, Callum, River, Liam, Charlotte, Alice, Matilda, Will, Jessica, Eric, Chris, Brian, Daniel, Lily, Bill — with Rachel as the default. Output formats include MP3, PCM, µ-law, A-law, and Opus codecs at various sample rates and bitrates (default mp3_44100_128). Pricing is $0.10 per 1,000 characters — same as Multilingual v2 — with no seat licenses, subscriptions, or minimums. ElevenLabs's voice cloning (Instant and Professional tiers) is referenced but typically accessed at the ElevenLabs platform level rather than via the fal.ai integration.

Reach for ElevenLabs v3 when (a) you need broad multilingual TTS coverage (70+ languages), (b) your content benefits from inline emotional direction via audio tags, (c) you're producing subtitled or lip-synced content where word timestamps matter, or (d) you want the latest ElevenLabs voice quality. For cost-sensitive workflows, consider Kokoro ($0.02) or Dia ($0.04); for established workflows built on the older Multilingual v2, that model remains available on Renas at the same price.
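The cost trade-off above is easy to quantify. A back-of-the-envelope Python sketch using the per-1,000-character rates quoted on this page (model keys are illustrative labels, not API identifiers):

```python
# Per-1,000-character rates quoted on this page.
RATES_PER_1K_CHARS = {
    "elevenlabs-v3": 0.10,
    "dia": 0.04,
    "kokoro": 0.02,
}

def tts_cost(model: str, num_chars: int) -> float:
    """Return the dollar cost of synthesizing num_chars characters."""
    return RATES_PER_1K_CHARS[model] * num_chars / 1000

# A 30,000-character audiobook chapter:
print(f"{tts_cost('elevenlabs-v3', 30_000):.2f}")  # 3.00
print(f"{tts_cost('kokoro', 30_000):.2f}")         # 0.60
```

At audiobook scale the 5x gap versus Kokoro is real money, which is why the "Things to consider" section below flags the premium pricing.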

Key Strengths

70+ language coverage

Broadest multilingual support on Renas TTS — English, Mandarin, Hindi, Arabic, Spanish, French, German, Japanese, Korean, and dozens more. Useful for international content workflows, multilingual podcasts, and global brand audio.

20 named voices

Hand-curated voice library: Aria, Roger, Sarah, Laura, Charlie, George, Callum, River, Liam, Charlotte, Alice, Matilda, Will, Jessica, Eric, Chris, Brian, Daniel, Lily, Bill. Default is Rachel. Pick voices by name and character — easier than navigating large unnamed libraries.

Inline audio tags for emotion

Direct emotional direction within the script via tags like `[laughs]`, `[whispers]`, `[excited]`, `[sad]`. Combined with broad dynamic range, produces expressive output without separate parameter tuning.

Word-level timestamps

Output includes precise per-word timing — essential for subtitling workflows, lip-sync video production, and any context where you need to know exactly when each word is spoken.
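Once you have per-word timing, converting it to SRT subtitles is mechanical. A minimal Python sketch, assuming the timestamps arrive as a list of `{"word", "start", "end"}` dicts with times in seconds (the actual response shape may differ):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def words_to_srt(words, max_words=7):
    """Group word timestamps into numbered SRT cues of up to max_words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        n = len(cues) + 1
        cues.append(
            f"{n}\n{srt_timestamp(chunk[0]['start'])} --> "
            f"{srt_timestamp(chunk[-1]['end'])}\n"
            + " ".join(w["word"] for w in chunk)
        )
    return "\n\n".join(cues)

words = [
    {"word": "Hello,", "start": 0.00, "end": 0.42},
    {"word": "welcome", "start": 0.50, "end": 0.90},
    {"word": "back.", "start": 0.95, "end": 1.30},
]
print(words_to_srt(words))
```

A production subtitler would also cap cue duration and line length, but the core mapping from word timing to cues is no more than this.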

Multiple output codecs

MP3, PCM, µ-law, A-law, and Opus formats at various sample rates and bitrates. Default mp3_44100_128 covers most use cases; specialized codecs available for telephony (µ-law/A-law) or low-bandwidth (Opus).
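The format identifiers appear to follow a `codec_samplerate_bitrate` naming pattern, inferred here from the default `mp3_44100_128`; treat this parsing as an assumption rather than a documented contract:

```python
def parse_output_format(fmt: str):
    """Split a format id like 'mp3_44100_128' into codec,
    sample rate (Hz), and bitrate (kbps, None if absent)."""
    parts = fmt.split("_")
    codec = parts[0]
    sample_rate = int(parts[1])
    bitrate = int(parts[2]) if len(parts) > 2 else None
    return codec, sample_rate, bitrate

print(parse_output_format("mp3_44100_128"))  # ('mp3', 44100, 128)
print(parse_output_format("pcm_16000"))      # ('pcm', 16000, None)
```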

Contextual emotional understanding

Beyond explicit tags, v3 reads narrative cues from context — adjusting tone based on the surrounding text without requiring you to tag every emotional shift.

Text-to-Speech

Voice synthesis capabilities

Available voices, languages, and expressive controls.

Voices
20
ready-to-use voice profiles
Languages
70+
supported
English, Arabic, Bulgarian, Chinese, Croatian, Czech, and more
Max characters
5,000
per request
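Scripts longer than the 5,000-character cap have to be split across requests. A minimal Python sketch that chunks at sentence boundaries (a simplification: it assumes no single sentence exceeds the limit):

```python
import re

MAX_CHARS = 5_000  # per-request limit from the specs above

def chunk_script(text: str, limit: int = MAX_CHARS):
    """Split a long script into <=limit-char chunks at sentence ends
    so each TTS request stays under the per-request cap."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + 1 + len(s) > limit:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks

parts = chunk_script("One. Two. Three.", limit=10)
print(parts)  # ['One. Two.', 'Three.']
```

Splitting at sentence ends (rather than at an arbitrary byte offset) avoids mid-sentence prosody breaks when the audio segments are concatenated.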

How it compares

ElevenLabs v3 is the most feature-rich TTS on Renas. Compare against alternatives based on language coverage, cost, and feature requirements.


Pros

  • 70+ language coverage — broadest on Renas TTS
  • 20 hand-curated named voices
  • Inline audio tags ([laughs], [whispers], [excited], [sad]) for emotion
  • Word-level timestamps for subtitling and lip-sync
  • Multiple output codecs (MP3, PCM, µ-law, A-law, Opus)
  • Contextual emotional understanding from narrative cues
  • Broadcast-quality voice synthesis

Things to consider

  • Premium pricing ($0.10/1K chars) — 2.5-5x more than Dia/Kokoro
  • 5,000-character cap per request — long scripts must be split across multiple calls
  • Voice cloning happens at ElevenLabs platform level, not via fal.ai integration
  • No multi-speaker dialogue tags (Dia has [S1]/[S2])
  • Specific accent and voice variations within named voices not fully documented

Best use cases

Multilingual content production

Generate the same content in multiple languages — Spanish marketing video, Mandarin product descriptions, Arabic news content. 70+ languages lets you scale to global audiences from a single TTS model.

Subtitled and lip-synced video

Word-level timestamps enable precise subtitle generation that matches the audio. Useful for accessibility-compliant video, multilingual subtitles, or animation/lip-sync workflows.

Audio dramas and narrative content

Inline audio tags ([laughs], [whispers], [sighs]) give fine-grained emotional control for character-driven audio. Combine with 20 named voices for casts of characters in audiobooks, podcasts, or audio dramas.

Premium podcast production

Broadcast-quality output + 20 named voices + emotional control fits podcast workflows where production quality matters. Use Rachel as default narrator, switch to other voices for guests or characters.

International marketing campaigns

Generate brand voice content across all your target markets in a single workflow. Same script, different language outputs — useful for global ad campaigns and content marketing at scale.

Accessibility content

Audio descriptions for visual content, screen reader-friendly content alternatives, multilingual accessibility transcripts. 70+ language support combined with broadcast quality makes v3 a strong accessibility tool.

How to use it on Renas AI

  1. Step 1: Open the AI Voice tool in TTS mode

     Navigate to AI Voice in the Renas dashboard, then switch to Text-to-Speech mode. Pick ElevenLabs v3 from the model selector — it's marked as the latest ElevenLabs variant.

  2. Step 2: Pick voice and language

     Choose from 20 named voices (Aria, Roger, Sarah, etc.) — Rachel is default. The 70+ language support is automatic — write in your target language and v3 handles it.

  3. Step 3: Add inline audio tags for emotion

     Direct emotional delivery in your script: `Hello, [excited] welcome to the show!` or `[whispers] This is a secret. [laughs] Just kidding.` Tags like [laughs], [whispers], [excited], [sad] work inline.

  4. Step 4: Generate, review, deploy

     Output goes to your asset library in your selected codec (default MP3). For subtitling, capture word-level timestamps from the API response. For video lip-sync, use the timestamps to align audio with visual elements.
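As a pre-flight check for the tags in Step 3, a small linter can catch mistyped tags before credits are spent. An illustrative Python sketch; the tag set below includes only the tags mentioned on this page, and v3 may accept others:

```python
import re

# Tags mentioned on this page; v3 likely supports more.
KNOWN_TAGS = {"laughs", "whispers", "excited", "sad", "sighs"}

def find_unknown_tags(script: str):
    """Return bracketed tags in the script that aren't in KNOWN_TAGS,
    so typos like [laugs] are caught before a request is sent."""
    tags = re.findall(r"\[([a-z ]+)\]", script)
    return [t for t in tags if t not in KNOWN_TAGS]

script = "[whispers] This is a secret. [laugs] Just kidding."
print(find_unknown_tags(script))  # ['laugs']
```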

Pricing

Pricing on Renas AI

Pay-as-you-go credits, no API keys, no rate limits.

293 credits per 1K chars
Included in every paid plan
No separate API key or setup
Predictable per-character credit cost
Commercial use rights for all output
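The credit rate above maps directly onto character counts. A sketch assuming credits scale linearly with characters and round up per request (the rounding behavior is an assumption, not documented here):

```python
import math

CREDITS_PER_1K_CHARS = 293  # Renas AI rate quoted above

def credits_for(text: str) -> int:
    """Credits charged for one synthesis request, assuming the rate
    applies proportionally to character count, rounded up."""
    return math.ceil(len(text) * CREDITS_PER_1K_CHARS / 1000)

print(credits_for("x" * 1000))  # 293
print(credits_for("x" * 2500))  # 733  (2.5 * 293 = 732.5, rounded up)
```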


70+ language premium AI voice

Use ElevenLabs v3 with your Renas AI subscription credits — no API key, no setup, no per-seat fees.

Try ElevenLabs v3
ElevenLabs v3 — Premium Multilingual TTS with Audio Tags | Renas AI