lukerbs / forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level text alignments of audio, with each word or phoneme's start and end time within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
☆25 Updated last year
Alternatives and similar repositories for forcealign
Users who are interested in forcealign are comparing it to the libraries listed below.
- Finetuning VITS Efficiently ☆33 Updated 2 years ago
- An unofficial PyTorch implementation of VALL-E ☆88 Updated 4 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated 2 years ago
- Putting flows on top of neural transducers for better TTS ☆64 Updated 2 weeks ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆104 Updated last year
- Official Code for ParrotTTS ☆58 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆56 Updated 6 months ago
- ☆58 Updated last year
- Chinese and English Bilingual G2P ☆22 Updated 2 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆109 Updated last year
- ☆56 Updated 2 years ago
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆52 Updated 2 years ago
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts). ☆91 Updated 2 months ago
- ☆71 Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 Updated 2 years ago
- ☆103 Updated 2 months ago
- ☆23 Updated last year
- A lightweight voice conversion ☆85 Updated last year
- ☆70 Updated 2 years ago
- TransferTTS (Zero-Shot learning of VITS) ☆101 Updated 3 years ago
- 4G GPU & 10 minutes to train ☆12 Updated 2 years ago
- Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer ☆75 Updated 2 years ago
- Monotonic Alignment Search ☆100 Updated 6 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆182 Updated 2 weeks ago
- ☆44 Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆76 Updated last year
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆95 Updated last year
- All generative models in one for a better TTS model ☆74 Updated last year
- ☆29 Updated 10 months ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆124 Updated 3 years ago