msplabresearch/MSP-Podcast_Challenge_IS2025

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/msplabresearch/MSP-Podcast_Challenge_IS2025)

msplabresearch / MSP-Podcast_Challenge_IS2025

MSP-Podcast Challenge Baseline Code for Interspeech 2025

☆29

Alternatives and similar repositories for MSP-Podcast_Challenge_IS2025

Users that are interested in MSP-Podcast_Challenge_IS2025 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

msplabresearch / MSP-Podcast_Challenge
View on GitHub
MSP-Podcast Challenge Baseline Code
☆31Jun 12, 2024Updated 2 years ago
ag027592 / EMO-SUPERB
View on GitHub
EMO-SUPERB: a reproducible speech emotion recognition benchmark with leakage-free splits for 6 datasets and 15 speech SSL models (IEEE SL…
☆51Updated this week
backspacetg / distilXLSR
View on GitHub
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13Mar 30, 2025Updated last year
zxzhao0 / C2SER
View on GitHub
We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…
☆49Mar 3, 2025Updated last year
Sreyan88 / ReCLAP
View on GitHub
☆33Dec 23, 2025Updated 7 months ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
WangHelin1997 / Aty-TTS
View on GitHub
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11May 14, 2025Updated last year
usc-sail / peft-ser
View on GitHub
[ACII 2023] PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Spe…
☆60Jul 1, 2024Updated 2 years ago
JunyiPeng00 / SLT22_MultiHead-Factorized-Attentive-Pooling
View on GitHub
An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification
☆24Sep 22, 2024Updated last year
yuhanghe01 / RiTTA
View on GitHub
Event Relation in Text-to-Audio (TTA) Generation
☆21Feb 26, 2025Updated last year
Choddeok / Affectron
View on GitHub
[ACL 2026 Findings] Affectron: Emotional Speech Synthesis with Affective and Contextually Aligned Nonverbal Vocalizations
☆20Jul 16, 2026Updated last week
speechwellness / SpeechWellness-1_Baseline
View on GitHub
☆11Feb 14, 2025Updated last year
Lhx94As / PHO-LID
View on GitHub
PHO-LID: A Unified Model to Incorporate Acoustic-Phonetic and Phonotactic Information for Language Identification
☆21Aug 24, 2023Updated 2 years ago
jishengpeng / WavReward
View on GitHub
WavReward: Spoken Dialogue Models With Generalist Reward Evaluators
☆56May 15, 2025Updated last year
Aria-K-Alethia / laughter-synthesis
View on GitHub
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77Jul 16, 2023Updated 3 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
ZehuaKcrissLi / GTR-Voice
View on GitHub
☆16Nov 11, 2024Updated last year
SWivid / AUV
View on GitHub
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28Oct 11, 2025Updated 9 months ago
yangdongchao / Tim-TSENet
View on GitHub
The source code of Tim-TSENet
☆15Apr 22, 2022Updated 4 years ago
slSeanWU / beats-conformer-bart-audio-captioner
View on GitHub
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…
☆41Jan 6, 2024Updated 2 years ago
ECNU-Cross-Innovation-Lab / ENT
View on GitHub
[ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
☆28Apr 11, 2024Updated 2 years ago
praveena2j / RecurrentJointAttentionwithLSTMs
View on GitHub
ICASSP 2023: "Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition"
☆14Nov 29, 2024Updated last year
msalhab96 / MultiSpeech
View on GitHub
pytorch implementation for MultiSpeech: Multi-Speaker Text to Speech with Transformer paper
☆21Jun 23, 2022Updated 4 years ago
suhitaghosh10 / emo-stargan
View on GitHub
Implementation of Emo-StarGAN
☆48Dec 19, 2023Updated 2 years ago
bnojavan / EmoReact
View on GitHub
☆14May 25, 2022Updated 4 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
SiavashShams / ssamba
View on GitHub
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
☆140Nov 5, 2025Updated 8 months ago
Sreyan88 / RECAP
View on GitHub
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16Jun 23, 2024Updated 2 years ago
Exploration-Lab / Shapes-of-Emotion
View on GitHub
☆15Mar 29, 2023Updated 3 years ago
ftshijt / Interspeech2024_DiscreteSpeechChallenge
View on GitHub
This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.
☆32Jan 26, 2024Updated 2 years ago
AmphionTeam / TARS
View on GitHub
[ACL 2026] Closing the Modality Reasoning Gap for Speech Large Language Models
☆18Apr 17, 2026Updated 3 months ago
ICDM-UESTC / DOSE
View on GitHub
DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023
☆60May 16, 2025Updated last year
hspark84 / lgtfb-en
View on GitHub
Learnable Gammatone Filterbank (LGTFB) and Equal-loudness Normalization (EN)
☆13Apr 24, 2020Updated 6 years ago
ajd12342 / paraspeechclap
View on GitHub
Codebase for 'ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining'
☆23Jun 20, 2026Updated last month
sotaque-brasileiro / sotaque-brasileiro
View on GitHub
Uma base de dados para estudo de regionalismos brasileiros através da voz.
☆11May 2, 2023Updated 3 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
kaistmm / AdaptVC
View on GitHub
☆17Jun 2, 2025Updated last year
haoxiangsnr / llm-tse
View on GitHub
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
☆43Oct 13, 2023Updated 2 years ago
nhut-ngnn / Multimodal-Speech-Emotion-Recognition
View on GitHub
A multimodal SER project combining BERT and ECAPA-TDNN with cross-attention-based fusion on the IEMOCAP dataset.
☆11Dec 9, 2024Updated last year
xiquan-li / FineLAP
View on GitHub
[ACL 2026 Main] FineLAP: Taming Heterogeneous Supervision for Fine-grained Language-Audio Pre-training
☆36Apr 20, 2026Updated 3 months ago
tiantiaf0627 / vox-profile-release
View on GitHub
Vox-Profile Benchmark
☆87Feb 16, 2026Updated 5 months ago
lucadellalib / ts-asr
View on GitHub
Target speaker automatic speech recognition (TS-ASR)
☆14Oct 14, 2023Updated 2 years ago
JHU-LCAP / FlexSED
View on GitHub
open-vocabulary sound event detection
☆53Dec 17, 2025Updated 7 months ago