mzarvandi/SER-wav2vec

Speech Emotion Recognition using transfer learning with wav2vec on IEMOCAP.

☆17

Alternatives and similar repositories for SER-wav2vec

Users that are interested in SER-wav2vec are comparing it to the libraries listed below.

OmarMohammed88 / AR-Emotion-Recognition
An implementation of the paper titled "Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset" https://…
☆16 · Feb 17, 2022 · Updated 4 years ago
vectominist / MiniASR
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53 · Dec 6, 2022 · Updated 3 years ago
skakouros / s3prl_attentive_correlation
Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit
☆13 · Nov 18, 2022 · Updated 3 years ago
b04901014 / FT-w2v2-ser
Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
☆152 · Oct 26, 2021 · Updated 4 years ago
habla-liaa / ser-with-w2v2
Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
☆140 · Jan 6, 2025 · Updated last year
PiotrSobczak / speech-emotion-recognition
Multi-modal Speech Emotion Recognition on IEMOCAP dataset
☆96 · Jul 6, 2023 · Updated 3 years ago
Sreyan88 / MMER
Code for the InterSpeech 2023 paper: MMER: Multimodal Multi-task learning for Speech Emotion Recognition
☆83 · Mar 12, 2024 · Updated 2 years ago
SonyCSLParis / vqcpc-gan
VQCPC-GAN: Variable-length Adversarial Audio Synthesis using Vector-Quantized Contrastive Predictive Coding
☆14 · Apr 27, 2021 · Updated 5 years ago
SuperKogito / SER-datasets
A collection of datasets for the purpose of emotion recognition/detection in speech.
☆420 · Sep 30, 2024 · Updated last year
Ztrimus / speech-emotion-recognition
Predicting various emotions in human speech signals by detecting different speech components affected by human emotion.
☆49 · Aug 2, 2024 · Updated last year
mechanicalsea / lighthubert
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT
☆73 · Sep 26, 2022 · Updated 3 years ago
Demfier / multimodal-speech-emotion-recognition
Lightweight and Interpretable ML Model for Speech Emotion Recognition and Ambiguity Resolution (trained on IEMOCAP dataset)
☆450 · Dec 21, 2023 · Updated 2 years ago
wq2012 / SimpleDER
A lightweight library to compute Diarization Error Rate (DER).
☆62 · Jan 14, 2026 · Updated 6 months ago
HappyColor / Vesper
A Compact and Effective Pretrained Model for Speech Emotion Recognition
☆55 · Apr 10, 2026 · Updated 3 months ago
tabahi / formantfeatures
Extract frequency, power, width and dissonance of formants from wav files
☆28 · Jun 3, 2022 · Updated 4 years ago
Vincent-ZHQ / CA-MSER
Code for Speech Emotion Recognition with Co-Attention based Multi-level Acoustic Information
☆163 · Nov 27, 2023 · Updated 2 years ago
SeanNobel / speech-decoding
Reimplementation of speech decoding 2022 paper by MetaAI
☆14 · Oct 17, 2023 · Updated 2 years ago
rendchevi / daisy-tts
🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition
☆14 · Nov 15, 2025 · Updated 8 months ago
nixiieee / RAVEN
RAVEN: Recognition of Audio-Visual Emotional Nuances - a project on building a multimodal emotion recognition system
☆16 · Jun 24, 2025 · Updated last year
Vaibhavs10 / how-to-asr
☆18 · Aug 29, 2022 · Updated 3 years ago
lstrgar / ss-phoneme-seg
Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology…
☆55 · Nov 4, 2022 · Updated 3 years ago
lixiangucas01 / GLAM
This is the official code for paper "Speech Emotion Recognition with Global-Aware Fusion on Multi-scale Feature Representation" published…
☆49 · Apr 11, 2022 · Updated 4 years ago
tsun / APA
Domain Adaptation with Adversarial Training on Penultimate Activations (AAAI 2023)
☆11 · Aug 1, 2023 · Updated 2 years ago
zy-du / Disentanglement-of-Emotional-Style-and-Speaker-Identity-for-Expressive-Voice-Conversion
This is the implementation of our Interspeech 2022 paper "Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conv…
☆21 · Sep 18, 2023 · Updated 2 years ago
ThomasWestfechtel / GSDE
Gradual Source Domain Expansion for Unsupervised Domain Adaptation
☆14 · Jun 10, 2025 · Updated last year
NariFan2002 / AttA-NET
ATTENTION AGGREGATION NETWORK FOR AUDIO-VISUAL EMOTION RECOGNITION
☆14 · Sep 25, 2023 · Updated 2 years ago
daniyo27 / XPS8940-OpenCore
OpenCore EFI config for Dell XPS 8940 & possibly G5 5090
☆10 · May 14, 2021 · Updated 5 years ago
xiaomi1024 / code_SAMS
☆13 · Jan 11, 2024 · Updated 2 years ago
Choddeok / DiEmo-TTS
[INTERSPEECH 2025] The official implementation of DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for…
☆17 · Jul 16, 2026 · Updated last week
LeBenchmark / Interspeech2021
This repository describes our reproducible framework for assessing self-supervised representation learning from speech
☆52 · Oct 8, 2021 · Updated 4 years ago
janaal1 / DCASE2020-Task3
☆15 · Oct 15, 2020 · Updated 5 years ago
yao-papercodes / AGLRLS
Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition (TMM 2024)
☆17 · Aug 13, 2024 · Updated last year
jefflai108 / Semi-Supervsied-Spoken-Language-Understanding-PyTorch
Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining
☆12 · Mar 23, 2021 · Updated 5 years ago
IoannisKansizoglou / Iemocap-preprocess
Multimodal preprocessing on IEMOCAP dataset
☆13 · Jun 8, 2018 · Updated 8 years ago
pquochuy / dcase2020-seld
Source code of the DCASE 2020 SELD submission "Audio Event Detection and Localization with Multitask Regression Network"
☆17 · Jul 8, 2020 · Updated 6 years ago
SeanPLeary / dc_tts-transfer-learning
Transfer learning exploration of dc_tts text-to-speech model
☆21 · Mar 5, 2019 · Updated 7 years ago
kehanlu / Mandarin-Wav2Vec2
Pre-trained Wav2vec2.0 for Mandarin
☆43 · Oct 30, 2022 · Updated 3 years ago
ozdemirozcelik / coint-tools
Pair Trading Analysis & Exercises Toolkit [Jupyter Notebook]
☆13 · Nov 3, 2023 · Updated 2 years ago
jqueguiner / wav2vec2-sprint
docker for HF wav2vec2-sprint
☆13 · Mar 26, 2021 · Updated 5 years ago