☆11Feb 14, 2025Updated last year
Alternatives and similar repositories for SpeechWellness-1_Baseline
Users that are interested in SpeechWellness-1_Baseline are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- Official repository for the paper "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs"☆21Sep 7, 2025Updated 8 months ago
- MSP-Podcast Challenge Baseline Code for Interspeech 2025☆28Dec 4, 2024Updated last year
- The baselines of ARC-Challenge-Interspeech2026☆59Dec 1, 2025Updated 5 months ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators☆56May 15, 2025Updated 11 months ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…☆14Feb 15, 2023Updated 3 years ago
- ☆84Mar 23, 2026Updated last month
- [ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment.☆25Nov 25, 2025Updated 5 months ago
- “莱斯杯”全国第一届“军事智能·机器阅读”挑战赛[初赛top2版]☆11Jan 15, 2019Updated 7 years ago
- ☆19Mar 2, 2024Updated 2 years ago
- ☆23Jan 29, 2026Updated 3 months ago
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆47Mar 3, 2025Updated last year
- MMER☆18Jan 8, 2026Updated 4 months ago
- [ICASSP2024] Code for paper "SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-modal Intent Detection"☆15Jul 6, 2024Updated last year
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- source code for "Towards Speaker-Unknown Emotion Recognition in Conversation via Progressive Contrastive Deep Supervision"☆11Nov 22, 2024Updated last year
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations☆38Oct 15, 2025Updated 6 months ago
- ☆17Apr 16, 2026Updated 3 weeks ago
- text-only training or language-free training for multimodal tasks (image/audio/video caption, retrieval, text2image)☆12Oct 15, 2024Updated last year
- Official data preparation scripts for the URGENT 2024 Challenge☆88May 21, 2025Updated 11 months ago
- Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"☆13Oct 31, 2024Updated last year
- ☆19Aug 23, 2024Updated last year
- [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis☆39Dec 24, 2025Updated 4 months ago
- EmoLLM: Multimodal Emotional Understanding Meets Large Language Models☆19Jun 24, 2024Updated last year
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- ☆18Mar 21, 2024Updated 2 years ago
- ☆11Dec 6, 2024Updated last year
- Non-parallel voice conversion called ICRCycleGAN-VC based on CycleGAN and Inception-resNet module by Afiuny☆15Apr 15, 2026Updated 3 weeks ago
- arxiv翻译修复器!☆22Nov 13, 2024Updated last year
- ☆48Apr 5, 2026Updated last month
- ☆36Jun 16, 2023Updated 2 years ago
- Training Transformers with knowledge localization (SGTM)☆51Jan 11, 2026Updated 3 months ago
- ☆11Oct 20, 2022Updated 3 years ago
- uyghur text resource crawled from website☆12Dec 25, 2015Updated 10 years ago
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- ☆35Sep 24, 2024Updated last year
- This repository documents Barry's journey in learning deep learning for speech processing. Here, you'll find scripts and code snippets re…☆13Oct 8, 2025Updated 7 months ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…☆24Oct 8, 2025Updated 7 months ago
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models☆22Jul 10, 2024Updated last year
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Feb 23, 2024Updated 2 years ago
- Target speaker automatic speech recognition (TS-ASR)☆13Oct 14, 2023Updated 2 years ago
- kaldi cnn-tdnnf baseline☆13Aug 31, 2021Updated 4 years ago