coqui-ai / Trainer
🐸 - A general purpose model trainer, as flexible as it gets
★223 · Updated last year
Alternatives and similar repositories for Trainer
Users who are interested in Trainer compare it to the libraries listed below.
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in PyTorch ★273 · Updated last year
- Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ★255 · Updated last year
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ★176 · Updated last year
- ★262 · Updated last year
- ★377 · Updated 11 months ago
- NeMo text processing for ASR and TTS ★359 · Updated last week
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ★137 · Updated 2 years ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ★323 · Updated 9 months ago
- PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech ★340 · Updated 3 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model. ★363 · Updated last year
- Official Implementation of StyleTTS ★443 · Updated 7 months ago
- Google's SoundStorm: Efficient Parallel Audio Generation ★132 · Updated 2 years ago
- ★123 · Updated this week
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code ★202 · Updated 2 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ★148 · Updated last year
- On-device voice activity detection (VAD) powered by deep learning ★227 · Updated 2 weeks ago
- Open models for Coqui STT ★141 · Updated 2 years ago
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models ★162 · Updated last year
- Your one-stop solution for voice dataset creation ★123 · Updated last year
- [WIP] VoiceSmith makes training text to speech models easy. ★225 · Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ★102 · Updated 10 months ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ★260 · Updated 7 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ★150 · Updated last year
- ★273 · Updated last year
- Repository for the paper: VoiceMe: Personalized voice generation in TTS ★125 · Updated 3 years ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration… ★325 · Updated 2 years ago
- Desktop application for neural speech synthesis written in C++ ★215 · Updated 2 years ago
- Grapheme to phoneme conversion with deep learning. ★394 · Updated last year
- Faster Tortoise inference than Tortoise Fast Fork ★128 · Updated last year
- PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech ★229 · Updated 3 years ago