☆54Jul 1, 2024Updated last year
Alternatives and similar repositories for Mamba-in-Speech
Users that are interested in Mamba-in-Speech are comparing it to the libraries listed below.
- ☆10Dec 22, 2023Updated 2 years ago
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆11Sep 30, 2024Updated last year
- ☆16Nov 9, 2023Updated 2 years ago
- ConMamba for Automatic Speech Recognition☆103Aug 12, 2024Updated last year
- ☆113Oct 1, 2024Updated last year
- This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).☆35Dec 17, 2024Updated last year
- Domain Adaptation with Adversarial Training on Penultimate Activations (AAAI 2023)☆11Aug 1, 2023Updated 2 years ago
- ☆16Dec 18, 2023Updated 2 years ago
- The official implementation of DMEL, the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer …☆22Dec 21, 2024Updated last year
- Target speaker automatic speech recognition (TS-ASR)☆12Oct 14, 2023Updated 2 years ago
- Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition (TMM 2024)☆16Aug 13, 2024Updated last year
- Official implementation of the paper "ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification"☆14Mar 14, 2025Updated last year
- Code for paper "Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition"☆20May 24, 2023Updated 2 years ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- Compute WER and SER for speech recognition evaluation☆27Mar 18, 2026Updated last week
- Code for paper Audio Visual Speaker Localization from EgoCentric Views☆11Jul 3, 2024Updated last year
- ☆18Mar 13, 2024Updated 2 years ago
- ☆13May 14, 2021Updated 4 years ago
- Local Context-Aware Active Domain Adaptation (ICCV 2023)☆21Oct 18, 2023Updated 2 years ago
- MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection☆21Jul 17, 2024Updated last year
- Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings☆19Jun 6, 2025Updated 9 months ago
- PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022)☆148Nov 22, 2022Updated 3 years ago
- Official repository for Mamba-based Segmentation Model for Speaker Diarization☆47May 13, 2025Updated 10 months ago
- [INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…☆18Jul 23, 2024Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 7 months ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning☆97Nov 20, 2024Updated last year
- Jupyter Notebook running Mamba speech synthesis example on Determined AI. Based on https://2084.substack.com/p/2084-marcrandbot-speech-sy…☆23Feb 8, 2024Updated 2 years ago
- StyleTTS2 + Vocos as a Decoder☆13Mar 24, 2025Updated last year
- ☆22Sep 10, 2024Updated last year
- Leveraging BERT to Improve Spoken Language Identification☆17Nov 22, 2022Updated 3 years ago
- "MULTIMODAL EMOTION RECOGNITION BASED ON DEEP TEMPORAL FEATURES USING CROSS-MODAL TRANSFORMER AND SELF-ATTENTION" ICASSP'23☆23Feb 26, 2023Updated 3 years ago
- Official release of pretrained models and codes for 'Golden Gemini Is All You Need: Finding the Sweet Spots for Speaker Verification'☆15Jan 20, 2025Updated last year
- ☆11Jun 14, 2024Updated last year
- A neural speech codec based on discrete WavLM representations☆26Aug 28, 2024Updated last year
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- Flow control nodes for comfyUI, allowing for more diverse workflows☆13Apr 3, 2025Updated 11 months ago
- The project for speech translation☆12Sep 28, 2023Updated 2 years ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment☆14Feb 5, 2025Updated last year
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations☆34Oct 15, 2025Updated 5 months ago