samsad35 / VQ-MAE-S-code
A Vector Quantized Masked AutoEncoder for speech emotion recognition
☆16 Updated 8 months ago
Related projects
Alternatives and complementary repositories for VQ-MAE-S-code
- Official implementation of SpeechFormer written in Python (PyTorch). ☆75 Updated last year
- [IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer ☆115 Updated 7 months ago
- EMO-SUPERB submission ☆28 Updated 2 months ago
- Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition ☆144 Updated 3 years ago
- [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation ☆24 Updated 2 months ago
- A Compact and Effective Pretrained Model for Speech Emotion Recognition ☆28 Updated 4 months ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions. ☆52 Updated last month
- SpeechFormer++ in PyTorch ☆42 Updated last year
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units ☆25 Updated 3 weeks ago
- [INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark ☆146 Updated 5 months ago
- PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models (… ☆47 Updated 4 months ago
- [ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations ☆35 Updated 11 months ago
- [CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models. ☆102 Updated 5 months ago
- Code for Speech Emotion Recognition with Co-Attention based Multi-level Acoustic Information ☆129 Updated 11 months ago
- [ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition ☆18 Updated 7 months ago
- Code for the InterSpeech 2023 paper: MMER: Multimodal Multi-task learning for Speech Emotion Recognition ☆66 Updated 8 months ago
- Official Code implementation for the ICLR paper "LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading" ☆47 Updated 2 months ago
- [ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆50 Updated last week
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024) ☆51 Updated 5 months ago
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022) ☆112 Updated 9 months ago
- Pytorch implementation for “V2C: Visual Voice Cloning” ☆30 Updated last year
- An implementation of Speech Emotion Recognition, based on HuBERT model, training with PyTorch and HuggingFace framework, and fine-tuning … ☆32 Updated 2 years ago
- [WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition ☆92 Updated last year
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024) ☆55 Updated 4 months ago