lukerbs / forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. It produces word- or phoneme-level alignments, giving each word's or phoneme's start and end time within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
☆20, updated 10 months ago
Alternatives and similar repositories for forcealign
Users who are interested in forcealign are comparing it to the libraries listed below.
- An unofficial PyTorch implementation of VALL-E ☆88, updated 2 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆102, updated last year
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts). ☆85, updated last week
- Chinese and English bilingual G2P ☆21, updated 2 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆105, updated last year
- Official Code for ParrotTTS ☆55, updated last year
- ☆57, updated last year
- Finetuning VITS Efficiently ☆33, updated last year
- ☆23, updated 11 months ago
- ☆99, updated 2 weeks ago
- A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup. ☆76, updated 11 months ago
- ☆29, updated 8 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63, updated 2 years ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems. ☆39, updated 2 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆76, updated 11 months ago
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?". ☆25, updated last year
- ☆37, updated 6 months ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆53, updated 4 months ago
- Putting flows on top of neural transducers for better TTS ☆64, updated this week
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud… ☆105, updated 9 months ago
- All generative models in one for a better TTS model ☆74, updated last year
- Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges. ☆159, updated 2 months ago
- A lightweight voice conversion ☆85, updated last year
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆124, updated 3 years ago
- audiolm-pytorch training code ☆15, updated 2 years ago
- ☆71, updated 2 years ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆18, updated 10 months ago
- Style-Controllable Zero-Shot Text to Speech Synthesizer based on VALL-E ☆135, updated 11 months ago
- ☆69, updated 2 years ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆81, updated last year