coqui-ai / Trainer
🐸 - A general purpose model trainer, as flexible as it gets
★229 · Updated last year
Alternatives and similar repositories for Trainer
Users who are interested in Trainer are comparing it to the libraries listed below.
- ★261 · Updated last year
- ★379 · Updated last year
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch ★275 · Updated 2 years ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ★328 · Updated last year
- Official Implementation of StyleTTS ★455 · Updated 10 months ago
- NeMo text processing for ASR and TTS ★396 · Updated last week
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ★259 · Updated 3 weeks ago
- Desktop application for neural speech synthesis written in C++ ★212 · Updated 2 years ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ★178 · Updated last year
- Open models for Coqui STT ★148 · Updated 2 years ago
- Your one-stop solution for voice dataset creation ★128 · Updated 2 years ago
- [WIP] VoiceSmith makes training text to speech models easy. ★227 · Updated 3 years ago
- Google's SoundStorm: Efficient Parallel Audio Generation ★130 · Updated 2 years ago
- ★157 · Updated 3 weeks ago
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models ★169 · Updated last year
- PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech ★342 · Updated 3 years ago
- Putting flows on top of neural transducers for better TTS ★64 · Updated this week
- ★275 · Updated last year
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code ★203 · Updated 3 years ago
- PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech ★232 · Updated 3 years ago
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. ★255 · Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ★151 · Updated last year
- Faster Tortoise inference than Tortoise Fast Fork ★128 · Updated last year
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration… ★326 · Updated 3 years ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ★263 · Updated 10 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ★153 · Updated last year
- Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, … ★291 · Updated 2 years ago
- The reproduced code for Google's SoundStorm ★269 · Updated 2 years ago
- Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning. ★359 · Updated 2 years ago
- On-device voice activity detection (VAD) powered by deep learning ★235 · Updated this week