supertone-inc / supertonic
Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.
☆2,588 · Jan 22, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for supertonic
Users who are interested in supertonic are comparing it to the libraries listed below.
- LEMAS‑TTS is a multilingual zero‑shot text‑to‑speech system, supporting 10 languages: Chinese English Spanish Russian French German Ital… ☆91 · Jan 14, 2026 · Updated last month
- TTS model capable of streaming conversational audio in realtime. ☆1,059 · Nov 29, 2025 · Updated 2 months ago
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion ☆33 · Sep 9, 2025 · Updated 5 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆19,109 · Nov 19, 2025 · Updated 2 months ago
- Training, inference, and testing of the SAC speech codec model. ☆96 · Nov 1, 2025 · Updated 3 months ago
- Towards Human-Sounding Speech ☆5,944 · Dec 5, 2025 · Updated 2 months ago
- Interface for OuteTTS models. ☆1,421 · Jun 21, 2025 · Updated 7 months ago
- ☆99 · Jan 19, 2026 · Updated 3 weeks ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆108 · Jan 17, 2025 · Updated last year
- Soprano: Instant, Ultra-Realistic Text-to-Speech ☆1,177 · Jan 15, 2026 · Updated last month
- SOTA Open Source TTS ☆24,863 · Feb 2, 2026 · Updated 2 weeks ago
- A TTS that fits in your CPU (and pocket) ☆3,134 · Updated this week
- Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expres… ☆7,160 · Mar 5, 2025 · Updated 11 months ago
- SoTA open-source TTS ☆22,571 · Feb 3, 2026 · Updated last week
- Inference and training library for high-quality TTS models. ☆5,533 · Dec 10, 2024 · Updated last year
- Controllable and fast Text-to-Speech for over 7000 languages! ☆2,183 · Jan 25, 2026 · Updated 3 weeks ago
- Soprano-Factory: Train your own 2000x realtime text-to-speech model ☆206 · Jan 13, 2026 · Updated last month
- A highly optimized engine for neutts-air model to generate minutes of audio in seconds. Over 200x realtime on modern hardware! ☆110 · Nov 24, 2025 · Updated 2 months ago
- https://hf.co/hexgrad/Kokoro-82M ☆5,625 · Aug 6, 2025 · Updated 6 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,164 · Aug 10, 2024 · Updated last year
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ☆8,176 · Updated this week
- Try to replicate the architecture of MiniMaxTTS mentioned in its technical report ☆49 · Sep 2, 2025 · Updated 5 months ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆14,079 · Updated this week
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆44,516 · Aug 16, 2024 · Updated last year
- A real-time streaming conversational video system that transforms text interactions into continuous, high-fidelity video responses using … ☆297 · Dec 15, 2025 · Updated 2 months ago
- On-device TTS model by Neuphonic ☆4,794 · Updated this week
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions ☆83 · Oct 11, 2024 · Updated last year
- A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec. ☆151 · Jan 27, 2026 · Updated 2 weeks ago
- ☆390 · Nov 2, 2025 · Updated 3 months ago
- ☆6,065 · Aug 29, 2025 · Updated 5 months ago
- Foundational model for human-like, expressive TTS ☆4,190 · Jul 30, 2024 · Updated last year
- Open-Source Frontier Voice AI ☆23,186 · Feb 7, 2026 · Updated last week
- FlowMirror-HydraVox — A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens… ☆38 · Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆9,593 · Feb 9, 2026 · Updated last week
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆186 · Oct 26, 2025 · Updated 3 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆106 · Oct 9, 2024 · Updated last year
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆35,918 · Apr 19, 2025 · Updated 9 months ago
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆22 · Feb 7, 2026 · Updated last week
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder ☆31 · Aug 30, 2025 · Updated 5 months ago