Tonyyouyou/Mamba-in-Speech

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Tonyyouyou/Mamba-in-Speech)

Tonyyouyou / Mamba-in-Speech

☆56

Alternatives and similar repositories for Mamba-in-Speech

Users that are interested in Mamba-in-Speech are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

zds-potato / multilingual-phonetic-sv
View on GitHub
☆10Dec 22, 2023Updated 2 years ago
huangruizhe / audio
View on GitHub
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆11Sep 30, 2024Updated last year
Miamoto / Conformer-NTM
View on GitHub
☆16Nov 9, 2023Updated 2 years ago
xi-j / Mamba-ASR
View on GitHub
ConMamba for Automatic Speech Recognition
☆106Aug 12, 2024Updated last year
xi-j / Mamba-TasNet
View on GitHub
☆116Oct 1, 2024Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
Audio-WestlakeU / UMA-ASR
View on GitHub
This repository is the official implementation of unimodal aggregation (UMA) for automaticspeech recognition (ASR).
☆35Dec 17, 2024Updated last year
tsun / APA
View on GitHub
Domain Adaptation with Adversarial Training on Penultimate Activations (AAAI 2023)
☆11Aug 1, 2023Updated 2 years ago
Takaaki-Saeki / ssl_speech_restoration_v2
View on GitHub
☆17Dec 18, 2023Updated 2 years ago
johnmartinsson / differentiable-mel-spectrogram
View on GitHub
The official implementation of DMEL the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer …
☆23Dec 21, 2024Updated last year
lucadellalib / ts-asr
View on GitHub
Target speaker automatic speech recognition (TS-ASR)
☆14Oct 14, 2023Updated 2 years ago
mmmmayi / ExPO
View on GitHub
official implementation of paper ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification
☆14Mar 14, 2025Updated last year
yao-papercodes / AGLRLS
View on GitHub
Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition (TMM 2024)
☆17Aug 13, 2024Updated last year
YUCHEN005 / Gradient-Remedy
View on GitHub
Code for paper "Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition"
☆21May 24, 2023Updated 3 years ago
backspacetg / distilXLSR
View on GitHub
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13Mar 30, 2025Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
pengzhendong / compute-wer
View on GitHub
Compute WER and SER for speech recognition evaluation
☆26Jun 6, 2026Updated last month
KawhiZhao / Egocentric-Audio-Visual-Speaker-Localization
View on GitHub
Code for paper Audio Visual Speaker Localization from EgoCentric Views
☆11Jul 3, 2024Updated 2 years ago
tuanct1997 / Federated-Learning-ASR-based-on-wav2vec-2.0
View on GitHub
☆18Mar 13, 2024Updated 2 years ago
jhwanflow / Fastspeech2-Korean
View on GitHub
☆13May 14, 2021Updated 5 years ago
muuda / MFF-EINV2
View on GitHub
MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection
☆22Jul 17, 2024Updated last year
tsun / LADA
View on GitHub
Local Context-Aware Active Domain Adaptation (ICCV 2023)
☆22Oct 18, 2023Updated 2 years ago
seorim0 / SE-using-SRL-Model
View on GitHub
Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings
☆20Jun 6, 2025Updated last year
upskyy / Squeezeformer
View on GitHub
PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022)
☆148Nov 22, 2022Updated 3 years ago
nttcslab-sp / mamba-diarization
View on GitHub
Official repository for Mamba-based Segmentation Model for Speaker Diarization
☆47May 13, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
kaistmm / voxceleb-disentangler
View on GitHub
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18Jul 23, 2024Updated last year
NTIA / alignnet
View on GitHub
Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.
☆18Aug 1, 2025Updated 11 months ago
yanghaha0908 / FastHuBERT
View on GitHub
Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
☆100Nov 20, 2024Updated last year
NikolaiKyhne / MambAttention
View on GitHub
Official repository for the paper "MambAttention: Mamba with Multi-Head Attention for Generalizable Single-Channel Speech Enhancement" (A…
☆33Mar 25, 2026Updated 3 months ago
5Hyeons / StyleTTS2-Vocos
View on GitHub
StyleTTS2 + Vocos as a Decoder
☆13Mar 24, 2025Updated last year
ighodgao / mamba-speech-synthesis
View on GitHub
Jupyter Notebook running Mamba speech synthesis example on Determined AI. Based on https://2084.substack.com/p/2084-marcrandbot-speech-sy…
☆23Feb 8, 2024Updated 2 years ago
royal-dargon / MDSE
View on GitHub
☆23Sep 10, 2024Updated last year
THUsatlab / BERT-LID
View on GitHub
Leveraging BERT to Improve Spoken Language Identification
☆17Nov 22, 2022Updated 3 years ago
Liu-Tianchi / Golden-Gemini-for-Speaker-Verification
View on GitHub
Official release of pretrained models and codes for 'Golden Gemini Is All You Need: Finding the Sweet Spots for Speaker Verification'
☆15Jan 20, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
bubaimaji / cmt-mser
View on GitHub
"MULTIMODAL EMOTION RECOGNITION BASED ON DEEP TEMPORAL FEATURES USING CROSS-MODAL TRANSFORMER AND SELF-ATTENTION" ICASSP'23
☆24Feb 26, 2023Updated 3 years ago
SSTC-Challenge / SSTC2024_baseline_system
View on GitHub
☆12Jun 14, 2024Updated 2 years ago
lucadellalib / discrete-wavlm-codec
View on GitHub
A neural speech codec based on discrete WavLM representations
☆26Aug 28, 2024Updated last year
pengzhendong / streaming-asr
View on GitHub
One command to start a streaming ASR server.
☆12Oct 2, 2024Updated last year
gitmylo / FlowNodes
View on GitHub
Flow control nodes for comfyUI, allowing for more diverse workflows
☆13Apr 3, 2025Updated last year
xuchennlp / S2T
View on GitHub
The project for speech translation
☆12Sep 28, 2023Updated 2 years ago
fgnt / speaker_reassignment
View on GitHub
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14Feb 5, 2025Updated last year