lukerbs / forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word-level or phoneme-level alignments of text to audio, including the start and end time of each word or phoneme within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
☆13 · Updated 4 months ago
Alternatives and similar repositories for forcealign:
Users who are interested in forcealign are comparing it to the libraries listed below:
- ☆20 · Updated 6 months ago
- Incorporating AutoVocoder to MB-iSTFT-VITS ☆48 · Updated 2 years ago
- GPT for FACodec ☆13 · Updated last year
- Official Code for ParrotTTS ☆48 · Updated 6 months ago
- A collection of all our phonemizers for dataset construction and inference ☆22 · Updated 2 months ago
- 'Grad-TTS' with Multilingual Cleaners ☆10 · Updated last year
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems. ☆39 · Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆27 · Updated 8 months ago
- Open Source Speech/Text Data on AI ☆18 · Updated 2 years ago
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b… ☆14 · Updated last year
- Simple and lightweight Zero-shot Text-to-Speech (TTS) synthesis model ☆21 · Updated last week
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat… ☆32 · Updated 10 months ago
- Just another FastSpeech 2 but cleaner code :) ☆26 · Updated 9 months ago
- Temporary anonymous version ☆22 · Updated last year
- ☆13 · Updated 8 months ago
- text to speech ☆10 · Updated last year
- Torch implementation of Whisper-guided DDPM based Voice Conversion ☆49 · Updated 2 years ago
- Chinese and English Bilingual G2P ☆20 · Updated last year
- ☆56 · Updated 2 years ago
- Torchaudio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English. ☆11 · Updated 3 months ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark ☆27 · Updated 9 months ago
- Finetuning VITS Efficiently ☆32 · Updated last year
- An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis" ☆28 · Updated last year
- Simple inference for Vits2 TTS Using ONNXRUNTIME and espeak-ng on C++ ☆16 · Updated last year
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts ☆15 · Updated 4 months ago
- ☆24 · Updated 3 months ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆31 · Updated 8 months ago
- Phonemes and durations labeling based on whisper small ☆11 · Updated 9 months ago
- ☆18 · Updated 11 months ago
- ☆33 · Updated 3 years ago