Mason-Alberta Phonetic Segmenter
☆15Updated this week
Alternatives and similar repositories for MAPS
Users that are interested in MAPS are comparing it to the libraries listed below
- ☆13Oct 25, 2024Updated last year
- Pybind11 bindings for Kaldi☆15Feb 1, 2026Updated last month
- Transfer learning approach to pronunciation scoring☆11Jan 17, 2024Updated 2 years ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆35May 7, 2025Updated 9 months ago
- A Weakly Supervised Forced Alignment for disfluent speech☆15Nov 12, 2023Updated 2 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Apr 13, 2022Updated 3 years ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆31Aug 30, 2025Updated 6 months ago
- Megatts2 uses HierSpeechpp's vocoder☆18Dec 2, 2024Updated last year
- PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training☆41Dec 18, 2020Updated 5 years ago
- ☆37Jun 28, 2021Updated 4 years ago
- Python script to convert .srt subtitle files to Praat .textgrid files☆17Jul 10, 2024Updated last year
- Neural network-based forced alignment with bidirectional attention mechanism☆78Jan 17, 2025Updated last year
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Mar 12, 2023Updated 2 years ago
- ☆20Apr 18, 2024Updated last year
- Visual Speech Recognition☆19Dec 24, 2024Updated last year
- Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS"☆23Mar 6, 2023Updated 2 years ago
- ☆19Jun 28, 2022Updated 3 years ago
- Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion☆20Jul 9, 2019Updated 6 years ago
- ☆22Jun 30, 2021Updated 4 years ago
- ☆32Aug 22, 2024Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- A toolkit dedicated to speech evaluation.☆23Sep 26, 2024Updated last year
- A collection of all our phonemizers for dataset construction and inference☆27Feb 21, 2025Updated last year
- Implementation of the subscale framework from the WaveRNN paper, building on top of Fatchord's WaveRNN repo☆19Oct 8, 2020Updated 5 years ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN)☆21Sep 27, 2017Updated 8 years ago
- Recurrent Neural Aligner☆51Apr 14, 2020Updated 5 years ago
- Extract phoneme-level timestamps from speech audio.☆117Updated this week
- SylNet: An Adaptable End-to-End Syllable Count Estimator for Speech☆27May 25, 2023Updated 2 years ago
- Trained deep neural-net models for estimating articulatory keypoints from midsagittal ultrasound tongue videos and front-view lip camera …☆24Jun 13, 2023Updated 2 years ago
- Decoders from Kaldi using OpenFst☆34Jan 29, 2026Updated last month
- ☆25Jun 14, 2022Updated 3 years ago
- Compendium for the paper "Transparent pronunciation scoring using articulatorily weighted phoneme edit distance" by Karhila, Smolander, Y…☆25May 6, 2019Updated 6 years ago
- Collection of scripts from mHuBERT-147.☆32Nov 19, 2024Updated last year
- Official implementation of MelHuBERT☆68Feb 21, 2026Updated last week
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…☆33Jun 14, 2024Updated last year
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach☆20Aug 2, 2021Updated 4 years ago
- Rescoring methods for end-to-end Automatic Speech Recognition☆26Sep 23, 2020Updated 5 years ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆51Sep 20, 2025Updated 5 months ago
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model☆34Aug 27, 2023Updated 2 years ago