lukerbs / forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level text alignments of audio, with each word or phoneme's start and end time within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
☆23 · Updated 11 months ago
Alternatives and similar repositories for forcealign
Users who are interested in forcealign are comparing it to the libraries listed below.
- An unofficial PyTorch implementation of VALL-E ☆88 · Updated 3 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆103 · Updated last year
- ☆57 · Updated last year
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts). ☆89 · Updated last month
- Putting flows on top of neural transducers for better TTS ☆64 · Updated this week
- Finetuning VITS Efficiently ☆33 · Updated 2 years ago
- Official Code for ParrotTTS ☆58 · Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆109 · Updated last year
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆179 · Updated this week
- Timething is a library for aligning text transcripts with their audio recordings. ☆126 · Updated 11 months ago
- ☆29 · Updated 9 months ago
- ☆102 · Updated last month
- a lightweight voice conversion ☆85 · Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 · Updated 2 years ago
- ☆19 · Updated 8 months ago
- ☆71 · Updated 2 years ago
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆52 · Updated 2 years ago
- Chinese and English Bilingual G2P ☆21 · Updated 2 years ago
- ☆69 · Updated last year
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆124 · Updated 3 years ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆130 · Updated 2 years ago
- TransferTTS (Zero-Shot learning of VITS) ☆101 · Updated 3 years ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆55 · Updated 6 months ago
- Barkify: an unofficial training implementation of Bark TTS by suno-ai ☆127 · Updated 2 years ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆19 · Updated last year
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ☆192 · Updated 4 months ago
- A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup. ☆77 · Updated last year
- demo page https://MingjieChen.github.io/dygan-vc ☆67 · Updated 3 years ago
- ☆56 · Updated 2 years ago
- multilingual speech aligner ☆77 · Updated 2 years ago