MontrealCorpusTools/mfa-models

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/MontrealCorpusTools/mfa-models)

MontrealCorpusTools / mfa-models

Collection of pretrained models for the Montreal Forced Aligner

☆199

Alternatives and similar repositories for mfa-models

Users that are interested in mfa-models are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

MontrealCorpusTools / Montreal-Forced-Aligner
View on GitHub
Command line utility for forced alignment using Kaldi
☆1,851Jul 11, 2026Updated last week
SonyResearch / VRVQ
View on GitHub
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54May 1, 2025Updated last year
li1jkdaw / LPCNet_parallel
View on GitHub
Simulation of parallel synthesis with LPCNet vocoder
☆14May 5, 2020Updated 6 years ago
lingjzhu / charsiu
View on GitHub
Charsiu: A neural phonetic aligner.
☆346Sep 19, 2022Updated 3 years ago
Audio-Foundation-Models / ConversationTTS
View on GitHub
☆101Jan 19, 2026Updated 6 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
Mddct / transformer-vocos
View on GitHub
☆35Sep 6, 2025Updated 10 months ago
thuhcsi / FlatTN
View on GitHub
Chinese Text Normalization and Dataset
☆91May 14, 2022Updated 4 years ago
ex3ndr / supervoice-hybrid
View on GitHub
My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26Aug 5, 2024Updated last year
huangruizhe / audio
View on GitHub
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10Sep 30, 2024Updated last year
walker-hyf / NCSSD
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆61Nov 1, 2024Updated last year
lifeiteng / NotebookTTS
View on GitHub
Text-To-Speech for NotebookLM
☆39Jul 20, 2025Updated last year
google-deepmind / librispeech-long
View on GitHub
LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation …
☆98Dec 28, 2024Updated last year
youngsheen / GPST
View on GitHub
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
☆70Nov 1, 2024Updated last year
Mddct / usm-tokenizer
View on GitHub
semantic tokenizer for speech and music
☆20Jul 6, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
xinjli / transphone
View on GitHub
phoneme tokenizer and grapheme-to-phoneme model for 8k languages
☆174Jun 9, 2023Updated 3 years ago
fabianoluzbr / neural-g2p-portuguese
View on GitHub
Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form. It has a highly es…
☆19Jun 14, 2021Updated 5 years ago
Mddct / cosyvoice2-flow-optimized
View on GitHub
faster inference
☆27Jan 20, 2025Updated last year
zhenye234 / CoMoSpeech
View on GitHub
ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
☆214Apr 26, 2024Updated 2 years ago
MontrealCorpusTools / MFA-reorganization-scripts
View on GitHub
Collection of scripts and utilities for reorganizing corpora to use with the Montreal Forced Aligner
☆43Jun 22, 2021Updated 5 years ago
line / LibriTTS-P
View on GitHub
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
☆161Jun 13, 2024Updated 2 years ago
thuhcsi / SpanPSP
View on GitHub
☆76Apr 26, 2022Updated 4 years ago
Stylish-TTS / stylish-tts
View on GitHub
High quality text-to-speech based on StyleTTS 2.
☆78Apr 6, 2026Updated 3 months ago
thuhcsi / SpeechCraft
View on GitHub
The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.
☆197Feb 28, 2026Updated 4 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
lifeiteng / naturalspeech3_facodec
View on GitHub
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆252Apr 20, 2024Updated 2 years ago
ogunlao / glowtts_stdp
View on GitHub
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19Jun 5, 2023Updated 3 years ago
JishengBai / AudioSetCaps
View on GitHub
A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline
☆208Dec 13, 2024Updated last year
walker-hyf / GPT-Talker
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆78Nov 1, 2024Updated last year
pengzhendong / streaming-tts-webui
View on GitHub
Streaming Text to Speech Web UI
☆22May 6, 2024Updated 2 years ago
kakaobrain / g2pm
View on GitHub
A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset
☆367Dec 24, 2021Updated 4 years ago
KevinMIN95 / StyleSpeech
View on GitHub
Official implementation of Meta-StyleSpeech and StyleSpeech
☆253Feb 9, 2022Updated 4 years ago
lumaku / ctc-segmentation
View on GitHub
Segment an audio file and obtain utterance alignments. (Python package)
☆348May 15, 2024Updated 2 years ago
lmaxwell / McHuo
View on GitHub
A chinese singing voice dataset, professional male singer, 105 songs, 132 minutes
☆12Oct 19, 2023Updated 2 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
haidog-yaqub / DiffPitcher
View on GitHub
Diffusion-based singing voice pitch correction
☆144Sep 20, 2024Updated last year
ozspeech / OZSpeech
View on GitHub
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45Feb 9, 2025Updated last year
ryota-komatsu / speech_resynth
View on GitHub
Speech Resynthesis and Language Modeling
☆27Jun 11, 2025Updated last year
AlexandaJerry / SingingVoice-MFA-Training
View on GitHub
MFA acoustic model training based on Opencpop
☆15Sep 23, 2022Updated 3 years ago
auspicious3000 / ProsodyLM
View on GitHub
ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models
☆46Nov 18, 2025Updated 8 months ago
keonlee9420 / PortaSpeech
View on GitHub
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
☆342Feb 17, 2022Updated 4 years ago
yl4579 / PL-BERT
View on GitHub
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
☆269Jan 13, 2025Updated last year