OlaWod / mfaLinks

How to use the Montreal Forced Aligner.

☆8

Alternatives and similar repositories for mfa

Users who are interested in mfa are comparing it to the repositories listed below.

Zeqiang-Lai / Prosody_Prediction
Predict prosody labels for Chinese sentences.
☆41 Updated 3 years ago
huangzj421 / Aishell1Mix
A Mandarin speech separation dataset, similar to WSJMix and LibriMix
☆12 Updated 2 years ago
thuhcsi / FlatTN
Chinese Text Normalization and Dataset
☆84 Updated 3 years ago
papercup-open-source / phonological-features
Materials accompanying the paper "Phonological features for 0-shot multilingual speech synthesis"
☆33 Updated 4 years ago
Yablon / auorange
Audio LPC (linear prediction coding) using mel spectrograms, compatible with LPCNet
☆62 Updated 4 years ago
Liu-Feng-deeplearning / TTS-frontend
TTS-frontend with Bert and CRF/lstm (For Tacotron)
☆53 Updated 5 years ago
shaojinding / Adversarial-Many-to-Many-VC
[InterSpeech 2020] "Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition" by …
☆39 Updated 2 years ago
athena-team / DiDiSpeech
☆44 Updated 4 years ago
Mddct / cosyvoice2-flow-optimized
Faster inference
☆28 Updated 6 months ago
TomJwYu / WenetSpeechSpeakerCluster
☆56 Updated 2 years ago
lifeiteng / TTS-TextAnalyzer
TTS Text Analyzer
☆32 Updated 2 years ago
sigmeta / g2p-kd
Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion
☆20 Updated 6 years ago
Riroaki / Chinese-Rhythm-Predictor
Chinese prosody prediction model based on random forests and conditional random fields
☆28 Updated 11 months ago
snsun / kaldi-decoder-code-reading
☆32 Updated 2 years ago
thuhcsi / SpanPSP
☆76 Updated 3 years ago
nwpuaslp / TTS_Course
☆69 Updated 4 years ago
Liangzheng-ZL / BEdit-TTS
Speech samples and code of BEdit-TTS
☆33 Updated last year
thuhcsi / NeuFA
Neural network-based forced alignment with bidirectional attention mechanism
☆77 Updated 6 months ago
wenet-e2e / wesignal
Production first, nn-based on-device signal processing toolkit.
☆65 Updated 2 years ago
HaoranMiao / streaming-attention
Streaming attention networks for end-to-end automatic speech recognition
☆55 Updated 5 years ago
Kevin-naticl / LLaSE
LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement
☆16 Updated last week
Jackiexiao / tts-frontend-dataset
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
☆99 Updated last year
AI-Unicamp / TTS-Objective-Metrics
Objective metrics used in several text-to-speech (TTS) papers.
☆49 Updated last month
roedoejet / FastSpeech2_ACL2022_reproducibility
☆22 Updated last year
R1ckShi / SeACo-Paraformer
[ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer.
☆31 Updated last year
keonlee9420 / Stepwise_Monotonic_Multihead_Attention
PyTorch Implementation of Stepwise Monotonic Multihead Attention similar to Enhancing Monotonicity for Robust Autoregressive Transformer …
☆36 Updated 4 years ago
ZiangLong / LPCNet_pytorch
A PyTorch version of LPCNet, including weight dumping
☆35 Updated 3 years ago
XierHacker / Model_Fusion_Based_Prosody_Prediction
Model Fusion Based Prosody Prediction
☆17 Updated 7 years ago
k2-fsa / kaldifst
Python wrapper for OpenFST and its extensions from Kaldi. Also supports reading/writing ark/scp files
☆53 Updated 2 months ago
BakerBunker / SALT
[ASRU 2023] Code of paper SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation
☆19 Updated 11 months ago