EMO-SUPERB submission
☆51 · Oct 13, 2025 · Updated 6 months ago
Alternatives and similar repositories for EMO-SUPERB-submission
Users that are interested in EMO-SUPERB-submission are comparing it to the libraries listed below.
- MSP-Podcast Challenge Baseline Code for Interspeech 2025 ☆28 · Dec 4, 2024 · Updated last year
- A Compact and Effective Pretrained Model for Speech Emotion Recognition ☆54 · Apr 10, 2026 · Updated 3 weeks ago
- This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025. ☆54 · Apr 14, 2025 · Updated last year
- MSP-Podcast Challenge Baseline Code ☆31 · Jun 12, 2024 · Updated last year
- ☆36 · Sep 6, 2025 · Updated 7 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆16 · Jun 16, 2024 · Updated last year
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi… ☆65 · Dec 26, 2025 · Updated 4 months ago
- The demo page for ALMTokenizer ☆59 · Apr 14, 2025 · Updated last year
- Event Relation in Text-to-Audio (TTA) Generation ☆21 · Feb 26, 2025 · Updated last year
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte… ☆47 · Mar 3, 2025 · Updated last year
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset ☆15 · Apr 7, 2025 · Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆17 · May 16, 2025 · Updated 11 months ago
- Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization (ICASSP 2024). ☆29 · Apr 3, 2024 · Updated 2 years ago
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024) ☆59 · Jun 20, 2024 · Updated last year
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers ☆125 · Mar 20, 2025 · Updated last year
- [ACL 2026 Main] Training, inference, and testing of the SAC speech codec model. ☆101 · Nov 1, 2025 · Updated 5 months ago
- Audio Codec Speech processing Universal PERformance Benchmark ☆301 · Apr 1, 2026 · Updated last month
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme… ☆45 · Sep 5, 2025 · Updated 7 months ago
- [ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training fo… ☆1,106 · Dec 23, 2024 · Updated last year
- Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024) ☆95 · Dec 3, 2024 · Updated last year
- Code for DeSTA2.5-Audio, general-purpose LALM ☆139 · Feb 4, 2026 · Updated 2 months ago
- ☆37 · Jun 9, 2025 · Updated 10 months ago
- ☆23 · Jan 29, 2026 · Updated 3 months ago
- AuxFormer: Robust Approach to Audiovisual Emotion Recognition ☆14 · Mar 14, 2023 · Updated 3 years ago
- Audio Research in US. US-based professors who work on audio (music, speech, acoustics). For students who would like to apply for RA, PhD,… ☆27 · Feb 27, 2026 · Updated 2 months ago
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M… ☆20 · Jan 3, 2023 · Updated 3 years ago
- Collection of works for evaluating (and analyzing) large audio-language models (LALMs) ☆40 · Aug 11, 2025 · Updated 8 months ago
- [ACII 2023] PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Spe… ☆59 · Jul 1, 2024 · Updated last year
- ☆47 · Jul 7, 2025 · Updated 9 months ago
- ☆19 · Aug 23, 2024 · Updated last year
- ☆54 · Jul 16, 2025 · Updated 9 months ago
- ☆25 · Jul 30, 2025 · Updated 9 months ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators ☆56 · May 15, 2025 · Updated 11 months ago
- A collection of datasets for the purpose of emotion recognition/detection in speech. ☆413 · Sep 30, 2024 · Updated last year
- AudioBench: A Universal Benchmark for Audio Large Language Models ☆303 · Jun 17, 2025 · Updated 10 months ago
- A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation ☆163 · Nov 30, 2025 · Updated 5 months ago
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat… ☆33 · Jun 14, 2024 · Updated last year
- ☆12 · Nov 25, 2023 · Updated 2 years ago
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis ☆26 · Feb 11, 2026 · Updated 2 months ago