[SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition
☆19 Dec 1, 2024 Updated last year
Alternatives and similar repositories for madeon-asr
Users that are interested in madeon-asr are comparing it to the libraries listed below.
- ☆15 Nov 26, 2024 Updated last year
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database. ☆13 Mar 15, 2025 Updated last year
- ☆16 Nov 9, 2023 Updated 2 years ago
- ConMamba for Automatic Speech Recognition ☆105 Aug 12, 2024 Updated last year
- ☆11 Oct 20, 2022 Updated 3 years ago
- ☆15 Oct 10, 2024 Updated last year
- The project for speech translation ☆12 Sep 28, 2023 Updated 2 years ago
- Lightweight Speech Representation Learning for One-Shot Voice Conversion ☆23 Dec 12, 2024 Updated last year
- ESLTTS dataset ☆16 Feb 6, 2025 Updated last year
- ☆16 Dec 23, 2021 Updated 4 years ago
- [ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer. ☆44 Mar 15, 2024 Updated 2 years ago
- This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 20… ☆49 Dec 25, 2024 Updated last year
- Differentiable implementation of MSBG hearing loss model and MBSTOI intelligibility metric for Clarity Enhancement challenge. ☆21 Nov 19, 2021 Updated 4 years ago
- A Python implementation of Speech intelligibility in bits (SIIB) ☆25 Apr 4, 2022 Updated 4 years ago
- Official release of pretrained models and codes for 'Golden Gemini Is All You Need: Finding the Sweet Spots for Speaker Verification' ☆15 Jan 20, 2025 Updated last year
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆18 Oct 2, 2024 Updated last year
- Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing. ☆37 Jun 3, 2025 Updated last year
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement" ☆14 Feb 13, 2022 Updated 4 years ago
- ☆15 Aug 25, 2022 Updated 3 years ago
- Official code for AL-PINNS: Augmented Lagrangian relaxation method for Physics-Informed Neural Networks ☆12 Jul 29, 2023 Updated 2 years ago
- The official implementation of DMEL, the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer … ☆23 Dec 21, 2024 Updated last year
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning ☆48 Nov 8, 2023 Updated 2 years ago
- Variable Bitrate Residual Vector Quantization for Audio Coding ☆53 May 1, 2025 Updated last year
- Soniox Compare. Compare real-time voice AI side by side. No glossy charts, just results. ☆23 Jul 15, 2025 Updated 11 months ago
- ☆17 Oct 18, 2023 Updated 2 years ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆29 Oct 15, 2024 Updated last year
- ☆57 Dec 19, 2022 Updated 3 years ago
- ☆11 Mar 22, 2023 Updated 3 years ago
- Implementation for paper: Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement ☆22 Sep 21, 2021 Updated 4 years ago
- ☆12 Jun 10, 2021 Updated 5 years ago
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech ☆52 Oct 8, 2021 Updated 4 years ago
- ☆18 Mar 13, 2024 Updated 2 years ago
- (Interspeech 2025, official code) Speech enhancement based on cascaded two flows ☆16 Sep 1, 2025 Updated 9 months ago
- ☆19 Aug 23, 2024 Updated last year
- ☆10 Nov 16, 2024 Updated last year
- Leveraging Local and Global Patterns for Self-Attention Networks ☆12 Jun 3, 2019 Updated 7 years ago
- Acoustic Neighbor Embeddings ☆31 Jul 13, 2025 Updated 11 months ago
- Automatic gain control library ☆15 Jul 13, 2024 Updated last year
- Microservice that generates subtitles for TUM-Live ☆18 Apr 24, 2026 Updated last month