snakers4 / silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
☆5,325 · Updated last year
Alternatives and similar repositories for silero-models
Users who are interested in silero-models are comparing it to the libraries listed below.
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ☆6,004 · Updated 2 months ago
- 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy. ☆2,448 · Updated last year
- Open STT ☆798 · Updated 3 years ago
- TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, Germa… ☆3,935 · Updated 11 months ago
- Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker… ☆7,669 · Updated last week
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆5,775 · Updated 10 months ago
- A list of accessible speech corpora for ASR, TTS, and other Speech Technologies ☆1,331 · Updated last year
- Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech E… ☆1,787 · Updated 2 years ago
- 🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech. ☆1,149 · Updated last year
- End-to-End Speech Processing Toolkit ☆9,172 · Updated 2 weeks ago
- A PyTorch-based Speech Toolkit ☆9,948 · Updated last week
- WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries ☆1,073 · Updated 2 weeks ago
- A multi-voice TTS system trained with an emphasis on quality ☆14,248 · Updated 6 months ago
- Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch ☆1,606 · Updated last year
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone ☆979 · Updated 7 months ago
- Converts text to speech in realtime ☆3,151 · Updated 3 weeks ago
- A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources. ☆1,753 · Updated 7 months ago
- A python package to build AI-powered real-time audio applications ☆1,322 · Updated 4 months ago
- Text-prompted Generative Audio Model - With the ability to clone voices ☆3,304 · Updated last year
- An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning. ☆836 · Updated last year
- Controllable and fast Text-to-Speech for over 7000 languages! ☆1,606 · Updated 3 weeks ago
- End to end text to speech system using gruut and onnx ☆829 · Updated last year
- State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio. ☆3,710 · Updated last year
- Whisper realtime streaming for long speech-to-text transcription and translation ☆2,961 · Updated 5 months ago
- TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2. Supported languages that can use characters or subw… ☆980 · Updated 2 weeks ago
- On-device wake word detection powered by deep learning ☆4,163 · Updated this week
- Python interface to the WebRTC Voice Activity Detector ☆2,259 · Updated 11 months ago
- Foundational model for human-like, expressive TTS ☆4,131 · Updated 10 months ago
- An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech" ☆2,039 · Updated last year
- WaveRNN Vocoder + TTS ☆2,166 · Updated 2 years ago