coqui-ai / Trainer
🐸 - A general purpose model trainer, as flexible as it gets
(⭐223, updated last year)
Alternatives and similar repositories for Trainer
Users that are interested in Trainer are comparing it to the libraries listed below
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in PyTorch (⭐274, updated last year)
- (⭐378, updated last year)
- Official Implementation of StyleTTS (⭐452, updated 8 months ago)
- NeMo text processing for ASR and TTS (⭐376, updated last week)
- (⭐262, updated last year)
- A tokenizer, text cleaner, and phonemizer for many human languages. (⭐327, updated 10 months ago)
- Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation (⭐257, updated 2 years ago)
- Live speech recognition using Facebook's wav2vec 2.0 model. (⭐370, updated last year)
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP 2023. (⭐177, updated last year)
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation (⭐149, updated last year)
- [WIP] VoiceSmith makes training text to speech models easy. (⭐226, updated 3 years ago)
- PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech (⭐341, updated 3 years ago)
- Your one-stop solution for voice dataset creation (⭐125, updated last year)
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models (⭐163, updated last year)
- MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation (⭐393, updated 2 years ago)
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration… (⭐326, updated 3 years ago)
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … (⭐404, updated last year)
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code (⭐202, updated 3 years ago)
- Implementation of Meta-Voicebox: The first generative AI model for speech to generalize across tasks with state-of-the-art performance. (⭐587, updated 2 years ago)
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions (⭐260, updated 8 months ago)
- Grapheme to phoneme conversion with deep learning. (⭐401, updated last year)
- 🐸STT integration examples (⭐129, updated 3 years ago)
- (⭐133, updated 2 weeks ago)
- Faster Tortoise inference than Tortoise Fast Fork (⭐127, updated last year)
- Desktop application for neural speech synthesis written in C++ (⭐213, updated 2 years ago)
- A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech (⭐458, updated last year)
- (⭐274, updated last year)
- On-device voice activity detection (VAD) powered by deep learning (⭐230, updated 2 weeks ago)
- Open models for Coqui STT (⭐144, updated 2 years ago)
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in PyTorch (⭐665, updated last year)