EMOsuperb / EMO-SUPERB-submission
EMO-SUPERB submission
☆50Oct 13, 2025Updated 4 months ago
Alternatives and similar repositories for EMO-SUPERB-submission
Users who are interested in EMO-SUPERB-submission are comparing it to the libraries listed below.
- This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.☆47Apr 14, 2025Updated 10 months ago
- A Compact and Effective Pretrained Model for Speech Emotion Recognition☆53Jun 29, 2024Updated last year
- ☆36Sep 6, 2025Updated 5 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15May 16, 2025Updated 9 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆16Jun 16, 2024Updated last year
- MSP-Podcast Challenge Baseline Code for Interspeech 2025☆28Dec 4, 2024Updated last year
- MSP-Podcast Challenge Baseline Code☆30Jun 12, 2024Updated last year
- Event Relation in Text-to-Audio (TTA) Generation☆20Feb 26, 2025Updated 11 months ago
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆43Mar 3, 2025Updated 11 months ago
- The demo page for ALMTokenizer☆59Apr 14, 2025Updated 10 months ago
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…☆63Dec 26, 2025Updated last month
- ☆22Jul 30, 2025Updated 6 months ago
- Collection of works for evaluating (and analyzing) large audio-language models (LALMs)☆40Aug 11, 2025Updated 6 months ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆15Apr 7, 2025Updated 10 months ago
- Training, inference, and testing of the SAC speech codec model.☆96Nov 1, 2025Updated 3 months ago
- ☆54Jul 16, 2025Updated 7 months ago
- Audio Codec Speech processing Universal PERformance Benchmark☆296Jan 8, 2026Updated last month
- [ACII 2023] PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Spe…☆60Jul 1, 2024Updated last year
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)☆59Jun 20, 2024Updated last year
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"☆50Apr 7, 2025Updated 10 months ago
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…☆20Jan 3, 2023Updated 3 years ago
- ☆22Jan 29, 2026Updated 2 weeks ago
- ☆18Aug 23, 2024Updated last year
- Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization (ICASSP 2024).☆28Apr 3, 2024Updated last year
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…☆33Jun 14, 2024Updated last year
- Audio Research in US. US-based professors who work on audio (music, speech, acoustics). For students who would like to apply for RA, PhD,…☆27Nov 13, 2025Updated 3 months ago
- [ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training fo…☆1,056Dec 23, 2024Updated last year
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers☆125Mar 20, 2025Updated 10 months ago
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor☆18Jun 5, 2023Updated 2 years ago
- ☆34Jun 9, 2025Updated 8 months ago
- This repository contains the training code from paper "SpidR Learning Fast and Stable Linguistic Units for Spoken Language Models Without…☆47Feb 4, 2026Updated last week
- This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …☆57Aug 9, 2025Updated 6 months ago
- A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation☆142Nov 30, 2025Updated 2 months ago
- Code for DeSTA2.5-Audio, general-purpose LALM☆128Feb 4, 2026Updated last week
- ☆17Jul 22, 2024Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 6 months ago
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆42Sep 5, 2025Updated 5 months ago
- AudioBench: A Universal Benchmark for Audio Large Language Models☆295Jun 17, 2025Updated 8 months ago
- Comprehensive quantitative comparison of lossless and lossy audio codecs☆39Feb 11, 2023Updated 3 years ago