mispchallenge / MISP-ICME-AVSR
☆17Updated 10 months ago
Related projects
Alternatives and complementary repositories for MISP-ICME-AVSR
- ☆26Updated last year
- ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'☆35Updated 2 years ago
- ☆19Updated last year
- Self-supervised Speaker Diarization Interspeech 2022 Implementation☆9Updated last month
- ☆22Updated 7 months ago
- Learning differentiable temporal resolution on time-series data.☆33Updated 2 years ago
- ☆29Updated 2 years ago
- Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…☆29Updated last year
- This is the code for a controllable EVC framework for seen and unseen emotion generation.☆41Updated 3 years ago
- A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification"☆31Updated 3 years ago
- This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio".☆40Updated last month
- PyTorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.☆17Updated 2 years ago
- ☆46Updated last year
- ☆43Updated last year
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Updated 8 months ago
- Implementation of "A conformer-based classifier for variable-length utterance processing in anti-spoofing" published in Interspeech 2023.☆19Updated last year
- Official implementation of the INTERSPEECH 2024 paper: Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detect…☆24Updated 2 months ago
- ☆19Updated 4 months ago
- ☆32Updated last week
- SASV2 baseline, a track on ASVspoof5 phase2 challenge☆22Updated 4 months ago
- [ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition☆18Updated 7 months ago
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization☆11Updated 2 years ago
- PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models (…☆47Updated 4 months ago
- Code for paper "Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition"☆38Updated last year
- [INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.☆42Updated 9 months ago
- A PyTorch implementation (supporting batch and channel) of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech…☆11Updated 3 months ago
- This repository collects papers related to Speech Tokenizer.☆15Updated last month
- PyTorch implementation of Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Pro…☆19Updated 11 months ago
- A repo containing download guidance and corresponding scripts of the VoxBlink dataset.☆23Updated 7 months ago
- A deepfake audio dataset for detecting fake speech from codec-based speech synthesis systems, Interspeech 2024☆13Updated 3 months ago