snakers4 / silero-models
Silero Models: pre-trained text-to-speech models made embarrassingly simple
★5,655 · Updated this week
Alternatives and similar repositories for silero-models
Users who are interested in silero-models compare it to the libraries listed below.
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ★7,573 · Updated this week
- 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy. ★2,539 · Updated last year
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ★6,082 · Updated last year
- An Open Source text-to-speech system built by inverting Whisper. ★4,539 · Updated 6 months ago
- Whisper realtime streaming for long speech-to-text transcription and translation ★3,470 · Updated 3 weeks ago
- Open STT ★813 · Updated 3 years ago
- A nearly-live implementation of OpenAI's Whisper. ★3,644 · Updated 2 months ago
- Converts text to speech in realtime ★3,656 · Updated 4 months ago
- Transcription, forced alignment, and audio indexing with OpenAI's Whisper ★2,086 · Updated last month
- End to end text to speech system using gruut and onnx ★831 · Updated 2 years ago
- eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents. ★5,904 · Updated 2 weeks ago
- AI powered speech denoising and enhancement ★2,089 · Updated last year
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ★3,996 · Updated 11 months ago
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence ★2,691 · Updated 3 months ago
- Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker… ★8,789 · Updated this week
- MARS5 speech model (TTS) from CAMB.AI ★2,807 · Updated last year
- Noise suppression using deep filtering ★3,585 · Updated last year
- Foundational model for human-like, expressive TTS ★4,195 · Updated last year
- A python package to analyze and compare voices with deep learning ★3,172 · Updated 2 years ago
- Text-prompted Generative Audio Model - With the ability to clone voices ★3,334 · Updated 3 months ago
- AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of adv… ★2,160 · Updated 4 months ago
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone ★1,038 · Updated last year
- Controllable and fast Text-to-Speech for over 7000 languages! ★2,053 · Updated 5 months ago
- A python package to build AI-powered real-time audio applications ★1,882 · Updated 9 months ago
- Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) ★10,078 · Updated 2 years ago
- PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html ★2,189 · Updated 3 months ago
- An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning. ★841 · Updated 2 years ago
- TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, Germa… ★3,988 · Updated last year
- Fast inference engine for Transformer models ★4,179 · Updated this week
- A Python/Pytorch app for easily synthesising human voices ★1,441 · Updated last year