xi-j / Mamba-ASR
ConMamba for Automatic Speech Recognition
☆54 · Updated 5 months ago
Alternatives and similar repositories for Mamba-ASR:
Users interested in Mamba-ASR are comparing it to the repositories listed below.
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) ☆39 · Updated last year
- Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in PyTorch. ☆68 · Updated last year
- ☆43 · Updated last year
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation … ☆52 · Updated last month
- ☆48 · Updated 2 months ago
- Clustering-based methods for overlapping diarization ☆74 · Updated last year
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆84 · Updated 2 months ago
- A list of papers for child ASR ☆35 · Updated 3 months ago
- ☆51 · Updated last year
- ☆64 · Updated last year
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆48 · Updated 7 months ago
- TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings ☆22 · Updated 4 months ago
- ☆56 · Updated 3 months ago
- SLT 2024 Challenge: Post-ASR-Speaker-Tagging ☆14 · Updated 7 months ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ☆77 · Updated last month
- The open source code for SimpleSpeech series ☆122 · Updated 3 months ago
- [AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS ☆64 · Updated 2 months ago
- ☆30 · Updated last year
- ☆64 · Updated last year
- Unofficial PyTorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (… ☆60 · Updated 9 months ago
- Official repo of ICASSP 2024 paper - Generative De-Quantization for Neural Speech Codec via Latent Diffusion. ☆48 · Updated 3 weeks ago
- Official data preparation and metric evaluation scripts for the Interspeech 2025 URGENT challenge. ☆45 · Updated last week
- ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS ☆27 · Updated last year
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge. ☆32 · Updated last year
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee… ☆73 · Updated 7 months ago
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995 ☆67 · Updated last month
- ☆29 · Updated 2 months ago
- A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding" ☆54 · Updated 4 months ago
- Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enha… ☆32 · Updated 5 months ago