lukerbs / forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word- or phoneme-level text alignments of audio, with each word's or phoneme's start and end time within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
☆18Updated 9 months ago
Alternatives and similar repositories for forcealign
Users that are interested in forcealign are comparing it to the libraries listed below
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆103Updated 10 months ago
- Official Code for ParrotTTS☆54Updated 10 months ago
- An unofficial PyTorch implementation of VALL-E☆88Updated last month
- ☆57Updated last year
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts).☆83Updated this week
- Chinese and English Bilingual G2P☆21Updated 2 years ago
- ☆70Updated 2 months ago
- Putting flows on top of neural transducers for better TTS☆63Updated 3 weeks ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion☆103Updated last year
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions☆51Updated 4 years ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,…☆78Updated 11 months ago
- StyleTTS 2 Optimized Training Fork☆33Updated 7 months ago
- ☆22Updated 10 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆53Updated 3 months ago
- ☆29Updated 7 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147.☆18Updated 9 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆75Updated 10 months ago
- Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.☆150Updated last month
- ☆71Updated 2 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆63Updated 2 years ago
- ☆43Updated 11 months ago
- Just another FastSpeech 2 but cleaner code :)☆27Updated last year
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2☆52Updated last year
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".☆24Updated last year
- All generative models in one for a better TTS model☆72Updated 11 months ago
- a lightweight voice conversion☆84Updated last year
- The case study and multilingual performance of ICASSP submission☆24Updated 2 years ago
- ☆41Updated 6 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆173Updated last week
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems.☆39Updated 2 years ago