coqui-ai / Trainer
🐸 - A general purpose model trainer, as flexible as it gets
★229 · Updated last year
Alternatives and similar repositories for Trainer
Users who are interested in Trainer are comparing it to the libraries listed below.
- ★382 · Updated last year
- A tokenizer, text cleaner, and phonemizer for many human languages. ★329 · Updated last year
- NeMo text processing for ASR and TTS ★399 · Updated this week
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ★260 · Updated last month
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch ★275 · Updated 2 years ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ★178 · Updated last year
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models ★169 · Updated last year
- ★156 · Updated this week
- ★261 · Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ★151 · Updated last year
- Official Implementation of StyleTTS ★456 · Updated 11 months ago
- Google's SoundStorm: Efficient Parallel Audio Generation ★131 · Updated 2 years ago
- Your one-stop solution for voice dataset creation ★128 · Updated 2 years ago
- Desktop application for neural speech synthesis written in C++ ★212 · Updated 2 years ago
- PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech ★341 · Updated 3 years ago
- [WIP] VoiceSmith makes training text to speech models easy. ★228 · Updated 3 years ago
- Open models for Coqui STT ★148 · Updated 2 years ago
- ★275 · Updated last year
- Grapheme to phoneme conversion with deep learning. ★409 · Updated 2 years ago
- Implementation of Meta-Voicebox: The first generative AI model for speech to generalize across tasks with state-of-the-art performance. ★588 · Updated 2 years ago
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code ★203 · Updated 3 years ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration… ★326 · Updated 3 years ago
- ★359 · Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ★137 · Updated 2 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model. ★374 · Updated last year
- On-device voice activity detection (VAD) powered by deep learning ★237 · Updated this week
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ★263 · Updated 11 months ago
- Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, … ★291 · Updated 2 years ago
- 🐸STT integration examples ★129 · Updated 3 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ★107 · Updated last week