hitz-zentroa / whisper-lm
Add n-gram and large language model (LLM) support to Whisper models.
☆31Updated 3 months ago
Alternatives and similar repositories for whisper-lm
Users who are interested in whisper-lm are comparing it to the libraries listed below:
- Official Code for ParrotTTS☆54Updated 10 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147.☆18Updated 9 months ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion☆103Updated last year
- ☆25Updated this week
- ☆43Updated 11 months ago
- An unofficial PyTorch implementation of VALL-E☆88Updated 3 weeks ago
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (such as fish/cosy/chattts).☆83Updated last week
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆102Updated 10 months ago
- ☆68Updated 11 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆63Updated 2 years ago
- ☆29Updated 6 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆53Updated 3 months ago
- Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.☆151Updated 3 weeks ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆73Updated 9 months ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) in which the decoder has been replaced with Vocos.☆58Updated 10 months ago
- ☆57Updated last year
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning☆94Updated 9 months ago
- ☆80Updated 2 months ago
- ☆13Updated last year
- [TAFFC 2025] The official implementation of EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vec…☆109Updated 4 months ago
- Putting flows on top of neural transducers for better TTS☆63Updated 2 weeks ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations☆172Updated last year
- Collection of scripts from mHuBERT-147.☆29Updated 9 months ago
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS☆47Updated 8 months ago
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆98Updated 3 months ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆60Updated 9 months ago
- Official code for "EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting"☆70Updated 3 months ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆148Updated last year
- ☆69Updated last month
- A toolkit to calculate speech audio quality. Not affiliated with the original authors☆57Updated last year