lukerbs / forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word- or phoneme-level text alignments of audio, with each word's or phoneme's start and end time within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
☆14 Updated 5 months ago
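To make the description above concrete, here is a minimal sketch of what a word- or phoneme-level alignment looks like as data: each unit carries its text plus a start and end time in seconds. The `AlignedUnit` class, the `unit_at` helper, and the timings are all hypothetical and used only for illustration; they are not ForceAlign's actual API or output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AlignedUnit:
    """One aligned word or phoneme (hypothetical type, not ForceAlign's own class)."""
    text: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio


def unit_at(units: List[AlignedUnit], t: float) -> Optional[AlignedUnit]:
    """Return the unit whose [start, end) interval contains time t, if any."""
    for unit in units:
        if unit.start <= t < unit.end:
            return unit
    return None


# Made-up timings for illustration only; a real aligner would produce these.
words = [
    AlignedUnit("hello", 0.10, 0.45),
    AlignedUnit("world", 0.50, 0.95),
]

hit = unit_at(words, 0.60)
print(hit.text if hit else "(silence)")
```

A structure like this is what makes alignments useful downstream, for example for subtitle timing or for cutting audio at word boundaries.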
Alternatives and similar repositories for forcealign
Users who are interested in forcealign are comparing it to the libraries listed below.
- ☆20 Updated 6 months ago
- text to speech ☆10 Updated last year
- ☆56 Updated 2 years ago
- Just another FastSpeech 2 but cleaner code :) ☆26 Updated 10 months ago
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024) ☆12 Updated 8 months ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆27 Updated 9 months ago
- Official code for "EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting" ☆19 Updated this week
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆19 Updated 3 months ago
- Incorporating AutoVocoder to MB-iSTFT-VITS ☆48 Updated 2 years ago
- GPT for FACodec ☆13 Updated last year
- Open Source Speech/Text Data on AI ☆18 Updated 2 years ago
- Text-To-Speech for NotebookLM ☆29 Updated 4 months ago
- (WIP) long-form speech generation ☆31 Updated last month
- Phonemes and durations labeling based on whisper small ☆11 Updated 10 months ago
- Official Code for ParrotTTS ☆50 Updated 7 months ago
- noise reduction ☆17 Updated 10 months ago
- Torchaudio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English. ☆11 Updated 4 months ago
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis ☆34 Updated 6 months ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark ☆27 Updated last week
- Decoders from Kaldi using OpenFst ☆28 Updated 4 months ago
- An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis" ☆28 Updated last year
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat… ☆33 Updated 11 months ago
- with alignment learning and continuous wavelet transform ☆21 Updated 2 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 Updated last year
- 'Grad-TTS' with Multilingual Cleaners ☆10 Updated last year
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 Updated 3 years ago
- ☆33 Updated 3 years ago
- ☆18 Updated last year
- Forced alignment decoder for Whisper. ☆14 Updated last year
- video cut powered by AI ☆25 Updated 2 years ago