We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through contextual perception and Chain of Thought (CoT).
☆17 · Mar 3, 2025 · Updated last year
Alternatives and similar repositories for C2SER
Users that are interested in C2SER are comparing it to the libraries listed below.
- LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement ☆46 · Mar 10, 2025 · Updated last year
- Llasa Speed Up ☆62 · Jan 18, 2026 · Updated 2 months ago
- wenet_LLM_from_ASLP ☆15 · Nov 26, 2024 · Updated last year
- A Massive Contextual Speech Recognition Benchmark. ☆105 · Aug 6, 2025 · Updated 8 months ago
- OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU. ☆483 · Nov 23, 2025 · Updated 4 months ago
- ☆39 · Sep 25, 2025 · Updated 6 months ago
- Official repository for the WenetSpeech-Chuan dataset. ☆169 · Feb 5, 2026 · Updated 2 months ago
- A song aesthetic evaluation toolkit trained on SongEval. ☆295 · Jun 15, 2025 · Updated 9 months ago
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement ☆16 · Jul 11, 2025 · Updated 8 months ago
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee… ☆82 · Jun 7, 2024 · Updated last year
- ☆13 · Jun 8, 2024 · Updated last year
- Inference code for Audiodec-Valle-Wenetspeech4TTS ☆50 · Jul 14, 2024 · Updated last year
- A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation ☆291 · Feb 5, 2026 · Updated 2 months ago
- ☆34 · Sep 15, 2025 · Updated 6 months ago
- ☆22 · Jul 10, 2025 · Updated 9 months ago
- This is the official implementation of PGUSE ☆38 · Jun 7, 2025 · Updated 10 months ago
- Blazing fast data loading with HuggingFace Dataset and Ray Data ☆16 · Jan 12, 2024 · Updated 2 years ago
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages ☆19 · May 23, 2024 · Updated last year
- An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators. ☆235 · Feb 26, 2026 · Updated last month
- ☆49 · Jul 5, 2025 · Updated 9 months ago
- Official code of SenSE. ☆77 · Oct 30, 2025 · Updated 5 months ago
- A Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows ☆246 · Jan 8, 2026 · Updated 3 months ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective". ☆64 · Nov 5, 2025 · Updated 5 months ago
- Random Tips and Writeups. ☆15 · Feb 21, 2019 · Updated 7 years ago
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral) ☆33 · Feb 11, 2026 · Updated last month
- Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion ☆2,277 · Nov 27, 2025 · Updated 4 months ago
- Linux kernel study: the kernel in my heart ☆18 · Jun 24, 2025 · Updated 9 months ago
- ☆43 · Feb 8, 2025 · Updated last year
- ☆16 · Sep 12, 2023 · Updated 2 years ago
- An interactive TUI for visualizing code statistics from tokei. ☆34 · Jan 20, 2026 · Updated 2 months ago
- A Large-scale Wu Dialect Speech Corpus with Multi-dimensional Annotations ☆126 · Feb 6, 2026 · Updated 2 months ago
- 🐧 Ucanto UCAN RPC in Go ☆13 · Mar 18, 2026 · Updated 3 weeks ago
- Dhruva is an open-source platform for serving language AI models at scale. ☆21 · Aug 25, 2025 · Updated 7 months ago
- A Diffusion Probabilistic Model for Target Sound Extraction ☆40 · Sep 27, 2024 · Updated last year
- C++ version of ailia models repository ☆24 · Dec 31, 2025 · Updated 3 months ago
- LibAFLGo: Evaluating and Advancing Directed Greybox Fuzzing ☆25 · Mar 4, 2026 · Updated last month
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte… ☆44 · Mar 3, 2025 · Updated last year
- [ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment" ☆10 · May 5, 2024 · Updated last year
- Stop blaming lag. It's time to take absolute control of your connection ☆22 · Mar 2, 2026 · Updated last month