wanglin-lw / ST-Caps
☆11Updated 3 years ago
Alternatives and similar repositories for ST-Caps
Users who are interested in ST-Caps are comparing it to the repositories listed below
- ☆176Updated last year
- ☆19Updated last year
- [ACMMM2025] Official released code for ALLM4ADD☆36Updated 3 months ago
- Speech-related labs, companies, resources, internships, and more; recommendations and self-recommendations welcome☆594Updated last year
- ☆130Updated 2 weeks ago
- A list of tools, papers and code related to Fake Audio Detection.☆222Updated 2 months ago
- Code for LAVSS: Location-Guided Audio-Visual Spatial Audio Separation☆18Updated 11 months ago
- Public child-adult speaker diarization/classification model and code☆18Updated 9 months ago
- This repository includes the code to reproduce our paper "Automatic speaker verification spoofing and deepfake detection using wav2vec 2.…☆158Updated 2 years ago
- [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation☆44Updated last year
- A curated list of audio-visual learning methods and datasets.☆284Updated last year
- AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. I…☆11Updated 2 years ago
- ☆59Updated last year
- Accepted by TMM 2022☆19Updated 3 years ago
- Deformable Speech Transformer (DST)☆35Updated last year
- This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio".☆66Updated last year
- ACM MM 2022 paper "AVQA: A Dataset for Audio-Visual Question Answering on Videos"☆15Updated 2 years ago
- Code for the InterSpeech 2023 paper: MMER: Multimodal Multi-task learning for Speech Emotion Recognition☆81Updated last year
- ☆157Updated 3 years ago
- This repository contains metadata for the WavCaps dataset and code for downstream tasks.☆256Updated last year
- Voice Face Association Learning Paper List☆17Updated 2 years ago
- PyTorch Implementation of SimulLR☆11Updated 4 years ago
- ☆64Updated last year
- A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.☆240Updated last year
- ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore t…☆518Updated 9 months ago
- [IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer☆221Updated 2 months ago
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)☆22Updated last year
- ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'☆44Updated 3 years ago
- Code for paper "Audio Deepfake Detection with Self-supervised XLS-R and SLS classifier"☆58Updated last year
- Official implementation of SpeechFormer, written in Python (PyTorch).☆78Updated 2 years ago