DigitalPhonetics / IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
☆1,597 Updated last week
Alternatives and similar repositories for IMS-Toucan
Users who are interested in IMS-Toucan compare it to the libraries listed below.
- Interface for OuteTTS models. ☆1,283 Updated this week
- [ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching ☆1,012 Updated this week
- Converts text to speech in realtime ☆3,110 Updated 2 weeks ago
- Inference and training library for high-quality TTS models. ☆5,261 Updated 5 months ago
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model". ☆908 Updated 7 months ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆715 Updated 5 months ago
- MARS5 speech model (TTS) from CAMB.AI ☆2,760 Updated 10 months ago
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ☆929 Updated 9 months ago
- ☆1,128 Updated 3 months ago
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch ☆651 Updated 7 months ago
- Local SRT/LLM/TTS Voicechat ☆680 Updated 7 months ago
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone ☆974 Updated 6 months ago
- Unified-Modal Speech-Text Pre-Training for Spoken Language Processing ☆1,357 Updated last year
- first base model for full-duplex conversational audio ☆1,746 Updated 4 months ago
- Whisper with Medusa heads ☆838 Updated last month
- AI powered speech denoising and enhancement ☆1,802 Updated 5 months ago
- VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design ☆564 Updated last year
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆5,750 Updated 9 months ago
- An Open Source text-to-speech system built by inverting Whisper. ☆4,257 Updated last month
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,603 Updated 10 months ago
- ☆359 Updated 8 months ago
- TTS with kokoro and onnx runtime ☆2,008 Updated 3 weeks ago
- Implementation of Meta-Voicebox : The first generative AI model for speech to generalize across tasks with state-of-the-art performance. ☆581 Updated last year
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆567 Updated last month
- Local realtime voice AI ☆2,317 Updated 2 months ago
- unofficial vits2-TTS implementation in pytorch ☆523 Updated last year
- StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis. ☆1,078 Updated 9 months ago
- open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming… ☆3,327 Updated 6 months ago
- Official Implementation of StyleTTS ☆432 Updated 4 months ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆3,864 Updated 4 months ago