☆11Feb 14, 2025Updated last year
Alternatives and similar repositories for SpeechWellness-1_Baseline
Users that are interested in SpeechWellness-1_Baseline are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- Official repository for the paper "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs"☆21Sep 7, 2025Updated 7 months ago
- MSP-Podcast Challenge Baseline Code for Interspeech 2025☆28Dec 4, 2024Updated last year
- ☆81Mar 23, 2026Updated 3 weeks ago
- The baselines of ARC-Challenge-Interspeech2026☆58Dec 1, 2025Updated 4 months ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators☆55May 15, 2025Updated 11 months ago
- Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…☆14Feb 15, 2023Updated 3 years ago
- [ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment.☆25Nov 25, 2025Updated 4 months ago
- “莱斯杯”全国第一届“军事智能·机器阅读”挑战赛[初赛top2版]☆11Jan 15, 2019Updated 7 years ago
- ☆19Mar 2, 2024Updated 2 years ago
- ☆23Jan 29, 2026Updated 2 months ago
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆47Mar 3, 2025Updated last year
- MMER☆17Jan 8, 2026Updated 3 months ago
- [ICASSP2024] Code for paper "SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-modal Intent Detection"☆15Jul 6, 2024Updated last year
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- source code for "Towards Speaker-Unknown Emotion Recognition in Conversation via Progressive Contrastive Deep Supervision"☆11Nov 22, 2024Updated last year
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations☆36Oct 15, 2025Updated 6 months ago
- ☆18Aug 16, 2025Updated 8 months ago
- text-only training or language-free training for multimodal tasks (image/audio/video caption, retrieval, text2image)☆12Oct 15, 2024Updated last year
- Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"☆13Oct 31, 2024Updated last year
- Official data preparation scripts for the URGENT 2024 Challenge☆87May 21, 2025Updated 10 months ago
- ☆18Aug 23, 2024Updated last year
- [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis☆39Dec 24, 2025Updated 3 months ago
- EmoLLM: Multimodal Emotional Understanding Meets Large Language Models☆19Jun 24, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆17Mar 21, 2024Updated 2 years ago
- arxiv翻译修复器!☆22Nov 13, 2024Updated last year
- ☆11Dec 6, 2024Updated last year
- Non-parallel voice conversion called ICRCycleGAN-VC based on CycleGAN and Inception-resNet module by Afiuny☆15Oct 30, 2025Updated 5 months ago
- ☆48Apr 5, 2026Updated 2 weeks ago
- ☆36Jun 16, 2023Updated 2 years ago
- This repository documents Barry's journey in learning deep learning for speech processing. Here, you'll find scripts and code snippets re…☆13Oct 8, 2025Updated 6 months ago
- ☆11Oct 20, 2022Updated 3 years ago
- uyghur text resource crawled from website☆12Dec 25, 2015Updated 10 years ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- ☆35Sep 24, 2024Updated last year
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…☆24Oct 8, 2025Updated 6 months ago
- Training Transformers with knowledge localization (SGTM)☆51Jan 11, 2026Updated 3 months ago
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models☆22Jul 10, 2024Updated last year
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Feb 23, 2024Updated 2 years ago
- Target speaker automatic speech recognition (TS-ASR)☆13Oct 14, 2023Updated 2 years ago
- kaldi cnn-tdnnf baseline☆13Aug 31, 2021Updated 4 years ago