The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George Saon, Brian Kingsbury. Interspeech 2024).
☆20 · Oct 11, 2024 · Updated last year
Alternatives and similar repositories for mel-asr
Users that are interested in mel-asr are comparing it to the libraries listed below.
- Interface Design for Self-Supervised Speech Models, accepted to Interspeech 2024 ☆16 · Nov 19, 2024 · Updated last year
- ☆46 · Apr 16, 2023 · Updated 2 years ago
- ☆19 · Mar 22, 2024 · Updated 2 years ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT ☆41 · Aug 29, 2024 · Updated last year
- ☆13 · Sep 25, 2024 · Updated last year
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆12 · Mar 14, 2025 · Updated last year
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Jun 2, 2023 · Updated 2 years ago
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024 ☆32 · Mar 14, 2025 · Updated last year
- ESLTTS dataset ☆16 · Feb 6, 2025 · Updated last year
- An extension of the Kaldi speech recognition software that allows decoding of speech with hybrid word and phoneme graphs.… ☆11 · Feb 4, 2020 · Updated 6 years ago
- Yeast, a lite and light beamer theme ☆18 · Dec 6, 2020 · Updated 5 years ago
- Text-To-Speech for NotebookLM ☆39 · Jul 20, 2025 · Updated 8 months ago
- ☆39 · Apr 15, 2024 · Updated last year
- ☆25 · Mar 12, 2022 · Updated 4 years ago
- Implementation of the contextual biasing for ASR decoding on GPUs without lattice generation. The code supports submission to Interspeech… ☆21 · Sep 25, 2023 · Updated 2 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 · Jul 2, 2024 · Updated last year
- ☆38 · Jul 4, 2024 · Updated last year
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆30 · May 27, 2023 · Updated 2 years ago
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202… ☆27 · May 17, 2023 · Updated 2 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and fastly. ☆16 · Jun 17, 2022 · Updated 3 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 · Apr 10, 2025 · Updated 11 months ago
- ☆36 · Mar 14, 2025 · Updated last year
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE" ☆44 · Apr 10, 2023 · Updated 2 years ago
- An AR+AR TTS attempt. ☆18 · Jan 13, 2025 · Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec. ☆48 · Sep 4, 2023 · Updated 2 years ago
- CTC+Beam_Search+kenlm is a decoding system that uses Chinese characters as the acoustic modeling unit ☆48 · Jun 27, 2018 · Updated 7 years ago
- A Weakly Supervised Forced Alignment for disfluent speech ☆15 · Nov 12, 2023 · Updated 2 years ago
- ☆15 · Jul 4, 2024 · Updated last year
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023 ☆27 · Apr 27, 2023 · Updated 2 years ago
- A pitch detection model trained to be robust against noise and reverberation environments. ☆27 · Jan 21, 2025 · Updated last year
- "SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts" ☆77 · Jun 9, 2023 · Updated 2 years ago
- Just another FastSpeech 2 but cleaner code :) ☆29 · Jun 28, 2024 · Updated last year
- Official code of ElasticAST (Interspeech 2024 paper) ☆34 · Jul 30, 2024 · Updated last year
- A JAX library for building lattice-based speech transducer models ☆47 · Mar 2, 2026 · Updated 3 weeks ago
- Code for DeSTA2.5-Audio, general-purpose LALM ☆131 · Feb 4, 2026 · Updated last month
- ☆55 · Jul 16, 2025 · Updated 8 months ago
- ☆44 · Sep 19, 2024 · Updated last year
- ☆12 · Nov 7, 2024 · Updated last year
- A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permuta… ☆11 · Aug 8, 2020 · Updated 5 years ago