HappyColor/SpeechFormer2

SpeechFormer++ in PyTorch

☆50

Alternatives and similar repositories for SpeechFormer2

Users who are interested in SpeechFormer2 are comparing it to the repositories listed below.

HappyColor / SpeechFormer
Official implementation of SpeechFormer, written in Python (PyTorch).
☆78 · Apr 1, 2023 · Updated 3 years ago
HappyColor / DST
Deformable Speech Transformer (DST)
☆35 · Aug 8, 2024 · Updated last year
MengboLi / MS-SENet
☆11 · Jul 16, 2024 · Updated 2 years ago
HappyColor / Vesper
A Compact and Effective Pretrained Model for Speech Emotion Recognition
☆54 · Apr 10, 2026 · Updated 3 months ago
scutcsq / DWFormer
DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral)
☆69 · Jul 8, 2024 · Updated 2 years ago
ECNU-Cross-Innovation-Lab / ShiftSER
[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations
☆39 · Dec 18, 2023 · Updated 2 years ago
AliceOTHMANI / EmoAudioNet
Code for EmoAudioNet, a deep neural network for speech classification (published at ICPR 2020)
☆14 · Jul 13, 2020 · Updated 6 years ago
ECNU-Cross-Innovation-Lab / ENT
[ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
☆28 · Apr 11, 2024 · Updated 2 years ago
JabuMlDev / Speaker-VGG-CCT
Official implementation of the paper "SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transfor…
☆25 · Feb 17, 2023 · Updated 3 years ago
AryaAftab / LIGHT-SERNET
Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition
☆83 · May 25, 2022 · Updated 4 years ago
speechandlanguageprocessing / ICASSP2022-Depression
Automatic Depression Detection: A GRU/BiLSTM-based Model and an Emotional Audio-Textual Corpus
☆216 · Jul 10, 2023 · Updated 3 years ago
adbailey1 / DepAudioNet_reproduction
Reproduction of DepAudioNet by Ma et al. {DepAudioNet: An Efficient Deep Model for Audio based Depression Classification,(https://dl.acm.…
☆83 · Sep 9, 2021 · Updated 4 years ago
fchest / CSENet
CSENet: Complex Squeeze-and-Excitation Network for Speech Depression Level Prediction (ICASSP 2022)
☆14 · Jun 23, 2022 · Updated 4 years ago
nhattruongpham / mmser
SERVER: Multi-modal Speech Emotion Recognition using Transformer-based and Vision-based Embeddings
☆15 · Jan 23, 2024 · Updated 2 years ago
HoseinAzad / Transformer-based-SER
Transformer-based model for Speech Emotion Recognition (SER), implemented in PyTorch
☆42 · Apr 12, 2024 · Updated 2 years ago
EIHW / EmoNet
☆29 · Mar 8, 2022 · Updated 4 years ago
lixiangucas01 / GLAM
This is the official code for paper "Speech Emotion Recognition with Global-Aware Fusion on Multi-scale Feature Representation" published…
☆49 · Apr 11, 2022 · Updated 4 years ago
Strong-AI-Lab / emotion
Emotion Recognition ToolKit (ERTK): tools for emotion recognition. Dataset processing, feature extraction, experiments,
☆57 · Oct 19, 2025 · Updated 9 months ago
nii-yamagishilab / speaker_sex_attribute_privacy
Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE
☆15 · Nov 30, 2022 · Updated 3 years ago
Janie1996 / MSRFG
The code for Multi-Scale Receptive Field Graph Model for Emotion Recognition in Conversations
☆11 · Jan 17, 2023 · Updated 3 years ago
helang818 / LMVD
☆41 · May 7, 2024 · Updated 2 years ago
usc-sail / peft-ser
[ACII 2023] PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Spe…
☆60 · Jul 1, 2024 · Updated 2 years ago
dayanavivolab / s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
☆10 · Feb 29, 2024 · Updated 2 years ago
TideDancer / interspeech21_emotion
☆111 · Aug 10, 2022 · Updated 3 years ago
muzixingyun / LI-FPN
LI-FPN is an excellent model for depression recognition based on facial expression.
☆16 · Apr 5, 2024 · Updated 2 years ago
Jiaxin-Ye / TIM-Net_SER
[ICASSP 2023] Official TensorFlow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech E…
☆191 · May 15, 2024 · Updated 2 years ago
ASolitaryMan / HFLEA
FRAME-LEVEL EMOTIONAL STATE ALIGNMENT METHOD FOR SPEECH EMOTION RECOGNITION
☆23 · Dec 22, 2024 · Updated last year
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22 · May 26, 2025 · Updated last year
nextDeve / Depression-detect-ResNet
depression-detect: Predicting depression from AVEC2014 using ResNet18.
☆60 · Jun 17, 2024 · Updated 2 years ago
xuxiaoooo / ABAFnet
Attention-Based Acoustic Feature Fusion Network for Depression Detection
☆30 · Jun 14, 2025 · Updated last year
jayaneetha / emoDARTS
☆10 · Aug 16, 2024 · Updated last year
adbailey1 / daic_woz_process
☆73 · Feb 21, 2024 · Updated 2 years ago
sarulab-speech / spatial_voice_conversion
Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals
☆18 · Aug 8, 2024 · Updated last year
skakouros / s3prl_attentive_correlation
Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit
☆13 · Nov 18, 2022 · Updated 3 years ago
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 · Dec 3, 2024 · Updated last year
Vincent-ZHQ / CA-MSER
Code for Speech Emotion Recognition with Co-Attention based Multi-level Acoustic Information
☆163 · Nov 27, 2023 · Updated 2 years ago
PingCheng-Wei / DepressionEstimation
Bachelor Thesis - Deep Learning-based Multi-modal Depression Estimation
☆87 · Apr 29, 2023 · Updated 3 years ago
lstrgar / ss-phoneme-seg
Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology…
☆55 · Nov 4, 2022 · Updated 3 years ago
adelacvg / DPTTS
An AR+AR TTS attempt.
☆18 · Jan 13, 2025 · Updated last year