Bark
Transformer-based text-to-audio model by Suno. Can generate speech, music, and sound effects.
38.0k stars
text-to-speech, ai-music, voice-synthesis
Advantages
- Generates non-verbal sounds (laughter, sighs, background noise)
- Natural-sounding multilingual synthesis
- Strong technical foundation from the Suno team
Limitations
- Extremely slow inference (minutes for long text)
- Limited control over voice consistency
- High VRAM consumption
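
A minimal sketch of how these limitations are commonly worked around, under stated assumptions. It relies on the `generate_audio`, `preload_models`, and `SAMPLE_RATE` names, the `history_prompt` speaker presets, and the `SUNO_USE_SMALL_MODELS` / `SUNO_OFFLOAD_CPU` environment variables as described in Bark's README. The `split_into_chunks` and `synthesize` helpers, the 200-character limit, and the output path are hypothetical choices for illustration, not part of Bark.

```python
import os
import re

# Assumption: these variables (from Bark's README) select smaller models and
# CPU offloading to lower VRAM use. They must be set before `bark` is imported.
os.environ.setdefault("SUNO_USE_SMALL_MODELS", "True")
os.environ.setdefault("SUNO_OFFLOAD_CPU", "True")


def split_into_chunks(text, max_chars=200):
    """Group sentences into chunks of roughly max_chars (hypothetical helper).

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def synthesize(text, speaker="v2/en_speaker_6", out_path="bark_out.wav"):
    """Generate each chunk with the same speaker preset and join the audio."""
    # Imported lazily so split_into_chunks works without Bark installed.
    import numpy as np
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()
    # Reusing one history_prompt per chunk helps keep the voice consistent.
    pieces = [
        generate_audio(chunk, history_prompt=speaker)
        for chunk in split_into_chunks(text)
    ]
    write_wav(out_path, SAMPLE_RATE, np.concatenate(pieces))
    return out_path
```

Calling `synthesize("Hello there. [laughs] That was fun!")` would generate the text chunk by chunk; bracketed cues such as `[laughs]` are how Bark prompts request non-verbal sounds. Chunking does not make inference faster, it only keeps each generation short enough for the model to handle.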