☆55Jul 1, 2024Updated last year
Alternatives and similar repositories for Mamba-in-Speech
Users that are interested in Mamba-in-Speech are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆10Dec 22, 2023Updated 2 years ago
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆11Sep 30, 2024Updated last year
- ☆16Nov 9, 2023Updated 2 years ago
- ConMamba for Automatic Speech Recognition☆103Aug 12, 2024Updated last year
- ☆114Oct 1, 2024Updated last year
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- This repository is the official implementation of unimodal aggregation (UMA) for automaticspeech recognition (ASR).☆35Dec 17, 2024Updated last year
- Domain Adaptation with Adversarial Training on Penultimate Activations (AAAI 2023)☆11Aug 1, 2023Updated 2 years ago
- ☆16Dec 18, 2023Updated 2 years ago
- The official implementation of DMEL the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer …☆22Dec 21, 2024Updated last year
- Target speaker automatic speech recognition (TS-ASR)☆13Oct 14, 2023Updated 2 years ago
- official implementation of paper ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification☆14Mar 14, 2025Updated last year
- Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition (TMM 2024)☆17Aug 13, 2024Updated last year
- Code for paper "Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition"☆20May 24, 2023Updated 2 years ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Compute WER and SER for speech recognition evaluation☆27Mar 18, 2026Updated last month
- Code for paper Audio Visual Speaker Localization from EgoCentric Views☆11Jul 3, 2024Updated last year
- ☆18Mar 13, 2024Updated 2 years ago
- ☆13May 14, 2021Updated 4 years ago
- MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection☆21Jul 17, 2024Updated last year
- Local Context-Aware Active Domain Adaptation (ICCV 2023)☆21Oct 18, 2023Updated 2 years ago
- Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings☆19Jun 6, 2025Updated 11 months ago
- PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022)☆148Nov 22, 2022Updated 3 years ago
- Official repository for Mamba-based Segmentation Model for Speaker Diarization☆47May 13, 2025Updated 11 months ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- [INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…☆18Jul 23, 2024Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 9 months ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning☆97Nov 20, 2024Updated last year
- Jupyter Notebook running Mamba speech synthesis example on Determined AI. Based on https://2084.substack.com/p/2084-marcrandbot-speech-sy…☆23Feb 8, 2024Updated 2 years ago
- StyleTTS2 + Vocos as a Decoder☆13Mar 24, 2025Updated last year
- ☆22Sep 10, 2024Updated last year
- Leveraging BERT to Improve Spoken Language Identification☆17Nov 22, 2022Updated 3 years ago
- Official release of pretrained models and codes for 'Golden Gemini Is All You Need: Finding the Sweet Spots for Speaker Verification'☆15Jan 20, 2025Updated last year
- "MULTIMODAL EMOTION RECOGNITION BASED ON DEEP TEMPORAL FEATURES USING CROSS-MODAL TRANSFORMER AND SELF-ATTENTION" ICASSP'23☆23Feb 26, 2023Updated 3 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆12Jun 14, 2024Updated last year
- A neural speech codec based on discrete WavLM representations☆26Aug 28, 2024Updated last year
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- The project for speech translation☆12Sep 28, 2023Updated 2 years ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment☆14Feb 5, 2025Updated last year
- Official code for MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enh…☆54Mar 5, 2025Updated last year
- Demo for DART, Audio Imagination workshop submission in NeurIPS 2024☆13Apr 22, 2026Updated 2 weeks ago