[SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition
☆18Dec 1, 2024Updated last year
Alternatives and similar repositories for madeon-asr
Users that are interested in madeon-asr are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆14Nov 26, 2024Updated last year
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database.☆12Mar 15, 2025Updated last year
- ☆16Nov 9, 2023Updated 2 years ago
- ConMamba for Automatic Speech Recognition☆103Aug 12, 2024Updated last year
- ☆11Oct 20, 2022Updated 3 years ago
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- ☆14Oct 10, 2024Updated last year
- The project for speech translation☆12Sep 28, 2023Updated 2 years ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track.☆16Mar 28, 2023Updated 3 years ago
- Lightweight Speech Representation Learning for One-Shot Voice Conversion☆24Dec 12, 2024Updated last year
- ESLTTS dataset☆16Feb 6, 2025Updated last year
- ☆16Dec 23, 2021Updated 4 years ago
- [ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer.☆44Mar 15, 2024Updated 2 years ago
- This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 20…☆49Dec 25, 2024Updated last year
- A python implementation of Speech intelligibility in bits (SIIB)☆25Apr 4, 2022Updated 3 years ago
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- Official release of pretrained models and codes for 'Golden Gemini Is All You Need: Finding the Sweet Spots for Speaker Verification'☆15Jan 20, 2025Updated last year
- DUSTED: Spoken-Term Discovery using Discrete Speech Units☆18Oct 2, 2024Updated last year
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"☆14Feb 13, 2022Updated 4 years ago
- Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.☆37Jun 3, 2025Updated 9 months ago
- ☆15Aug 25, 2022Updated 3 years ago
- Official code for AL-PINNS: Augmented Lagrangian relaxation method for Physics-Informed Neural Networks☆12Jul 29, 2023Updated 2 years ago
- The official implementation of DMEL the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer …☆22Dec 21, 2024Updated last year
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆48Nov 8, 2023Updated 2 years ago
- Variable Bitrate Residual Vector Quantization for Audio Coding☆50May 1, 2025Updated 10 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Soniox Compare. Compare real-time voice AI side by side. No glossy charts, just results.☆23Jul 15, 2025Updated 8 months ago
- ☆57Dec 19, 2022Updated 3 years ago
- ☆17Oct 18, 2023Updated 2 years ago
- ☆11Mar 22, 2023Updated 3 years ago
- Implementation for paper: Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement☆22Sep 21, 2021Updated 4 years ago
- ☆12Jun 10, 2021Updated 4 years ago
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech☆51Oct 8, 2021Updated 4 years ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX☆29Oct 15, 2024Updated last year
- ☆18Mar 13, 2024Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- (Interspeech 2025, official code) Speech enhancement based on cascaded two flows☆16Sep 1, 2025Updated 6 months ago
- Acoustic Neighbor Embeddings☆28Jul 13, 2025Updated 8 months ago
- ☆18Aug 23, 2024Updated last year
- ☆10Nov 16, 2024Updated last year
- Leveraging Local and Global Patterns for Self-Attention Networks☆12Jun 3, 2019Updated 6 years ago
- Automatic gain control library☆15Jul 13, 2024Updated last year
- Microservice that generates subtitles for TUM-Live☆18Mar 19, 2026Updated last week