msaadsaeed / SBNet
Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".
☆12Updated 2 years ago
Alternatives and similar repositories for SBNet
Users who are interested in SBNet are comparing it to the libraries listed below:
- This repository contains the speaker labeled information of VoxCeleb2 and LRS3 audio-visual datasets. (AAAI 2025)☆12Updated last year
- ☆11Updated 2 months ago
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)☆78Updated 11 months ago
- Implementation of Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection paper☆60Updated 2 years ago
- Implementation of "A conformer-based classifier for variable-length utterance processing in anti-spoofing" published in Interspeech 2023.☆25Updated 2 years ago
- Implementation of "SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection" paper☆39Updated 2 years ago
- (SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition☆13Updated last year
- PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS 2021)☆25Updated last year
- Official implementation of the INTERSPEECH 2024 paper: Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detect…☆54Updated last year
- ☆19Updated 2 years ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023☆12Updated last year
- ☆17Updated 2 years ago
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"☆14Updated 11 months ago
- ☆24Updated last year
- ☆48Updated 3 years ago
- Pytorch implementation of "LEVERAGING POSITIONAL-RELATED LOCAL-GLOBAL DEPENDENCY FOR SYNTHETIC SPEECH DETECTION"☆37Updated 2 years ago
- Official implementation of the Odyssey paper "A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification"☆18Updated 3 years ago
- Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…☆35Updated 2 years ago
- [ACII 2023] PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Spe…☆60Updated last year
- A deepfake audio dataset for detecting fake speech from codec-based speech synthesis systems, Interspeech 2024☆20Updated last year
- SSL Layerwise analysis for speech deepfake detection☆32Updated 5 months ago
- ☆27Updated 2 years ago
- ☆35Updated last year
- ☆20Updated last year
- Official Pytorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners" [ICASSP 2025] and "Mitigat…☆56Updated 2 weeks ago
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages☆18Updated last year
- Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit☆13Updated 3 years ago
- Official implementation of FOP method as described in "Fusion and Orthogonal Projection for Improved Face-Voice Association"☆20Updated last month
- Continual Learning Method RAWM for ICML 2023☆23Updated last year
- ☆92Updated 4 years ago