lukerbs / forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level text alignments of audio, with each word or phoneme's start and end time within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
☆17Updated 7 months ago
Alternatives and similar repositories for forcealign
Users that are interested in forcealign are comparing it to the libraries listed below
- ☆56Updated 2 years ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems.☆39Updated last year
- Official Code for ParrotTTS☆52Updated 9 months ago
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions☆51Updated 4 years ago
- A collection of all our phonemeizers for dataset construction and inference☆24Updated 4 months ago
- GPT for FACodec☆13Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆63Updated 2 years ago
- ☆21Updated 8 months ago
- ☆29Updated last year
- ☆57Updated last year
- Torch implementation of Whisper-guided DDPM based Voice Conversion☆49Updated 2 years ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆28Updated 2 months ago
- Finetuning VITS Efficiently☆33Updated last year
- ☆28Updated 5 months ago
- An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis"☆28Updated last year
- Codebase for ICLR'23 paper "wav2tok: Deep Sequence Tokenizer for Audio Retrieval"☆36Updated last year
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆29Updated 2 years ago
- Incorporating AutoVocoder to MB-iSTFT-VITS☆48Updated 2 years ago
- one script for xls-r/xlsr/whisper fine-tuning☆42Updated 2 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆25Updated 7 months ago
- 'Grad-TTS' with Multilingual Cleaners☆10Updated last year
- ☆22Updated 4 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Updated 2 years ago
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts).☆78Updated last week
- Zero-Shot Foreign Accent Conversion without a Native Reference☆33Updated last year
- ☆23Updated 2 years ago
- ☆13Updated 10 months ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Updated 3 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Updated last year
- ☆33Updated 3 years ago