lukerbs / forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level text alignments of audio, with each word or phoneme's start and end time within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
☆25Updated last year
Alternatives and similar repositories for forcealign
Users who are interested in forcealign are comparing it to the libraries listed below.
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆104Updated last year
- An unofficial PyTorch implementation of VALL-E☆88Updated 5 months ago
- Putting flows on top of neural transducers for better TTS☆64Updated last month
- ☆58Updated last year
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆185Updated last week
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts).☆92Updated 3 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆56Updated 7 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147.☆20Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆63Updated 2 years ago
- Monotonic Alignment Search☆100Updated 7 months ago
- Official Code for ParrotTTS☆58Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion☆111Updated last year
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection☆103Updated 9 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆119Updated 2 years ago
- ☆105Updated 3 months ago
- Finetuning VITS Efficiently☆33Updated 2 years ago
- ☆29Updated 11 months ago
- A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.☆76Updated last year
- Chinese and English Bilingual G2P☆22Updated 2 years ago
- Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.☆53Updated 9 months ago
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2☆53Updated 2 years ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)☆124Updated 3 years ago
- ☆70Updated 2 years ago
- ☆71Updated 2 years ago
- multilingual speech aligner☆76Updated 2 years ago
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".☆25Updated last year
- Paraformer (Chinese ASR) online ONNX runtime for Python☆53Updated last year
- Official implementation of the TTS model Lina-Speech☆175Updated last year
- All generative models in one for a better TTS model☆74Updated last year
- Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer☆75Updated 2 years ago