[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
☆134 Nov 5, 2025 Updated 4 months ago
Alternatives and similar repositories for ssamba
Users who are interested in ssamba are comparing it to the libraries listed below
- ☆109 Oct 1, 2024 Updated last year
- ☆207 Dec 5, 2024 Updated last year
- ConMamba for Automatic Speech Recognition ☆103 Aug 12, 2024 Updated last year
- This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024) ☆251 Dec 12, 2025 Updated 2 months ago
- Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning" ☆167 Nov 24, 2024 Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆101 Jul 24, 2024 Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆11 May 14, 2025 Updated 9 months ago
- ☆12 Mar 11, 2025 Updated 11 months ago
- ☆33 Dec 23, 2025 Updated 2 months ago
- Official code of ElasticAST (Interspeech 2024 paper) ☆34 Jul 30, 2024 Updated last year
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee… ☆81 Jun 7, 2024 Updated last year
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers ☆125 Mar 20, 2025 Updated 11 months ago
- Audio Codec Speech processing Universal PERformance Benchmark ☆297 Jan 8, 2026 Updated 2 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆78 Nov 1, 2024 Updated last year
- MSP-Podcast Challenge Baseline Code for Interspeech 2025 ☆28 Dec 4, 2024 Updated last year
- ☆67 Aug 16, 2023 Updated 2 years ago
- ☆70 Jan 25, 2025 Updated last year
- The official implementation of DMEL, the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer … ☆22 Dec 21, 2024 Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments. ☆18 Aug 1, 2025 Updated 7 months ago
- ☆49 Apr 1, 2025 Updated 11 months ago
- A collection of audio signals accompanied by corresponding subjective scores of perceived quality. Everything under permissive licenses. ☆47 Feb 24, 2026 Updated last week
- Official code release for "RTFS-Net: Recurrent time-frequency modelling for efficient audio-visual speech separation", accepted ICLR 2024 ☆49 Oct 14, 2025 Updated 4 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆114 Jan 28, 2026 Updated last month
- Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement ☆472 May 19, 2025 Updated 9 months ago
- Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations" ☆41 Aug 14, 2025 Updated 6 months ago
- ☆21 Jul 15, 2024 Updated last year
- Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for V… ☆244 Jul 31, 2024 Updated last year
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆11 Mar 14, 2025 Updated 11 months ago
- Implementation of the paper: "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning" in pytorch ☆14 Mar 2, 2026 Updated last week
- A toolkit to calculate speech audio quality. Not affiliated with the original authors ☆69 Aug 13, 2024 Updated last year
- A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning) ☆29 Jul 9, 2024 Updated last year
- ☆24 Sep 10, 2025 Updated 5 months ago
- Official repository of SepReformer for speech separation ☆246 Jan 13, 2025 Updated last year
- Masked Modeling Duo: Towards a Universal Audio Pre-training Framework ☆138 Feb 23, 2026 Updated 2 weeks ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset ☆15 Apr 7, 2025 Updated 11 months ago
- Simple tool for speech dataset augmentation for modeling various prosodies. ☆14 Jan 14, 2021 Updated 5 years ago
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE ☆90 Apr 2, 2024 Updated last year
- Spherical residual vector quantization (SRVQ) ☆31 Aug 25, 2024 Updated last year
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆18 Oct 2, 2024 Updated last year