Speech Emotion Recognition using transfer learning with wav2vec on IEMOCAP.
☆17Aug 8, 2021Updated 4 years ago
Alternatives and similar repositories for SER-wav2vec
Users that are interested in SER-wav2vec are comparing it to the libraries listed below.
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆53Dec 6, 2022Updated 3 years ago
- Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit☆13Nov 18, 2022Updated 3 years ago
- Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition☆151Oct 26, 2021Updated 4 years ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'☆139Jan 6, 2025Updated last year
- Multi-modal Speech Emotion Recognition on IEMOCAP dataset☆96Jul 6, 2023Updated 2 years ago
- Code for the InterSpeech 2023 paper: MMER: Multimodal Multi-task learning for Speech Emotion Recognition☆83Mar 12, 2024Updated 2 years ago
- VQCPC-GAN: Variable-length Adversarial Audio Synthesis using Vector-Quantized Contrastive Predictive Coding☆14Apr 27, 2021Updated 5 years ago
- A collection of datasets for the purpose of emotion recognition/detection in speech.☆419Sep 30, 2024Updated last year
- Predicting various emotions in human speech signals by detecting the speech components affected by emotion.☆49Aug 2, 2024Updated last year
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆73Sep 26, 2022Updated 3 years ago
- [INTERSPEECH 2025] The official implementation of DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for…☆17Sep 7, 2025Updated 9 months ago
- A lightweight library to compute Diarization Error Rate (DER).☆62Jan 14, 2026Updated 5 months ago
- The Hybrid Image Matching (HIM) method that combines the deep learning approach with the feature point matching to image classification.☆15Jan 9, 2019Updated 7 years ago
- Lightweight and Interpretable ML Model for Speech Emotion Recognition and Ambiguity Resolution (trained on IEMOCAP dataset)☆451Dec 21, 2023Updated 2 years ago
- Code for Speech Emotion Recognition with Co-Attention based Multi-level Acoustic Information☆162Nov 27, 2023Updated 2 years ago
- A Compact and Effective Pretrained Model for Speech Emotion Recognition☆53Apr 10, 2026Updated 2 months ago
- Extract frequency, power, width and dissonance of formants from wav files☆28Jun 3, 2022Updated 4 years ago
- mmyun☆18Aug 4, 2025Updated 10 months ago
- ☆18Aug 29, 2022Updated 3 years ago
- Real-time version of sound_classification_demo in OpenVINO toolkit. Captures audio from the microphone, performs classification, and displays the result…☆12Jul 28, 2021Updated 4 years ago
- Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology…☆55Nov 4, 2022Updated 3 years ago
- This is the official code for paper "Speech Emotion Recognition with Global-Aware Fusion on Multi-scale Feature Representation" published…☆49Apr 11, 2022Updated 4 years ago
- Fixes the following message that Cursor shows during the free subscription period: Too many free trial accounts used on this machine. Please upgrade to pro. We have this limit in place to preve…☆10Dec 14, 2024Updated last year
- Reimplementation of speech decoding 2022 paper by MetaAI☆14Oct 17, 2023Updated 2 years ago
- ☆10Aug 14, 2023Updated 2 years ago
- ATTENTION AGGREGATION NETWORK FOR AUDIO-VISUAL EMOTION RECOGNITION☆13Sep 25, 2023Updated 2 years ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 7 months ago
- This is the implementation of our Interspeech 2022 paper "Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conv…☆21Sep 18, 2023Updated 2 years ago
- ☆13Jan 11, 2024Updated 2 years ago
- ☆30Dec 23, 2025Updated 5 months ago
- Minimal module for computing audio spectrograms☆15Feb 28, 2019Updated 7 years ago
- OpenCore EFI config for Dell XPS 8940 & possibly G5 5090☆10May 14, 2021Updated 5 years ago
- ☆15Oct 15, 2020Updated 5 years ago
- Multimodal preprocessing on IEMOCAP dataset☆13Jun 8, 2018Updated 8 years ago
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech☆52Oct 8, 2021Updated 4 years ago
- Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining☆12Mar 23, 2021Updated 5 years ago
- Source code of the DCASE 2020 SELD submission "Audio Event Detection and Localization with Multitask Regression Network"☆17Jul 8, 2020Updated 5 years ago
- Transfer learning exploration of dc_tts text-to-speech model☆21Mar 5, 2019Updated 7 years ago
- Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition (TMM 2024)☆17Aug 13, 2024Updated last year