[ICASSPW] A Vector Quantized Masked AutoEncoder for speech emotion recognition
☆29Mar 4, 2024Updated 2 years ago
Alternatives and similar repositories for VQ-MAE-S-code
Users that are interested in VQ-MAE-S-code are comparing it to the libraries listed below
- [EACL 2023] Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models…☆18May 7, 2024Updated last year
- Implementation of SoundStream, an end-to-end neural audio codec☆32Jun 11, 2023Updated 2 years ago
- A pytorch implementation of Speech emotion recognition using deep 1D & 2D CNN LSTM networks☆27Oct 4, 2023Updated 2 years ago
- [EMNLP 2023] An Empirical Exploration of Cross-domain Alignment between Language and Electroencephalogram☆29Nov 9, 2023Updated 2 years ago
- PyTorch implementation of Conformer with a training script for end-to-end speech recognition on the LibriSpeech dataset.☆28May 1, 2024Updated last year
- ZuCo Reading Task Classification Benchmark using EEG and Eye-Tracking Data☆35Aug 22, 2023Updated 2 years ago
- An implementation of Speech Emotion Recognition, based on HuBERT model, training with PyTorch and HuggingFace framework, and fine-tuning …☆33May 18, 2022Updated 3 years ago
- Supercharge your Gaianet node by generating a vector knowledge base from any API. Demo slides: https://hackmd.io/@santteegt/ByoykY4nC#/ L…☆11Nov 29, 2024Updated last year
- ☆12Nov 12, 2024Updated last year
- Anki add-on that adds Pinyin and Zhuyin readings above Chinese characters in any field.☆12Sep 23, 2025Updated 5 months ago
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”☆15Jan 3, 2025Updated last year
- Dataset/code for AudioMarkBench: Benchmarking Robustness of Audio Watermarking☆45Aug 23, 2024Updated last year
- ☆13Jan 2, 2025Updated last year
- Official Code Repository for the paper "Generating Realistic Images from In-the-wild Sounds", ICCV 2023☆12Aug 24, 2025Updated 6 months ago
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"☆14Feb 24, 2025Updated last year
- Companion code for Awe the Audience: How the Narrative Trajectories Affect Audience Perception in Public Speaking☆14Jan 6, 2018Updated 8 years ago
- Code for Spiking Neural Networks with Improved Inherent Recurrence Dynamics for Sequential Learning☆11May 5, 2022Updated 3 years ago
- [ICLR 2026] Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing☆25Jan 27, 2026Updated last month
- Repository for reproducing the results of the journal article "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 2 years ago
- This repository contains the speaker labeled information of VoxCeleb2 and LRS3 audio-visual datasets. (AAAI 2025)☆13Sep 6, 2024Updated last year
- ☆13Oct 3, 2025Updated 5 months ago
- SongDriver2 achieves a balance between real-time emotion fit and soft transitions, enhancing the coherence of the generated music.☆11Nov 15, 2025Updated 3 months ago
- ☆11Sep 1, 2024Updated last year
- ☆10Aug 16, 2024Updated last year
- ☆53Sep 20, 2024Updated last year
- A reconstruction framework for materializing subjective experiences from brain signals☆13Jan 18, 2025Updated last year
- First neural GPT aligned with text and speech. Welcome to join us to make better foundation model in neural modality.☆14Oct 30, 2024Updated last year
- Continual Online Recalibration with Pseudo-labels☆12Jun 20, 2024Updated last year
- The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…☆10Oct 12, 2023Updated 2 years ago
- A jekyll template derived from Minimal Mistakes and inspired by academicpages. To see an example of what a webpage might look like with t…☆15May 22, 2018Updated 7 years ago
- An up-to-date & curated list of awesome semi-supervised segmentation papers, methods & resources.☆13Dec 22, 2023Updated 2 years ago
- Conditional EEG diffusion model☆16Apr 5, 2024Updated last year
- Audio-Visual Speech Recognition☆20Jul 7, 2025Updated 7 months ago
- Using GAN to create synthetic and partially synthetic EEG data to augment training sets for motor imagery interaction tasks☆13Aug 27, 2019Updated 6 years ago
- ☆12Feb 27, 2024Updated 2 years ago
- An unofficial (PyTorch) implementation for the paper Deep Lip Reading: A comparison of models and an online application.☆10May 13, 2020Updated 5 years ago
- Comparing performance of different InfoNCE type losses used in contrastive learning.☆14Jun 12, 2024Updated last year
- Official implementation of DNSMOS Pro (accepted at INTERSPEECH 2024).☆78Jun 8, 2025Updated 8 months ago
- Public dataset developed by KICT_INTFLOW for IITP AI GrandChallenge 2019, Track-3☆13Mar 4, 2020Updated 6 years ago