Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".
☆12Aug 28, 2023Updated 2 years ago
Alternatives and similar repositories for SBNet
Users that are interested in SBNet are comparing it to the libraries listed below
- Official implementation of FOP method as described in "Fusion and Orthogonal Projection for Improved Face-Voice Association"☆21Dec 31, 2025Updated 2 months ago
- [IJCAI2022] Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast☆21Oct 25, 2023Updated 2 years ago
- Voice Face Association Learning Paper List☆17May 20, 2023Updated 2 years ago
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024)☆22Jul 25, 2024Updated last year
- ☆11Nov 5, 2025Updated 3 months ago
- Code for "Self-Lifting: A Novel Framework For Unsupervised Voice-Face Association Learning, ICMR, 2022"☆15Oct 25, 2024Updated last year
- Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC21☆16May 14, 2022Updated 3 years ago
- ☆19Jun 8, 2021Updated 4 years ago
- Download and preprocess voxceleb datasets.☆38Jun 18, 2025Updated 8 months ago
- ☆13Jan 8, 2024Updated 2 years ago
- Chorale Music Separation Dataset and Model Framework☆40Dec 5, 2022Updated 3 years ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- ☆11Aug 11, 2023Updated 2 years ago
- Frequency tracking in time-frequency representations☆13Jan 19, 2021Updated 5 years ago
- A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features☆10Oct 5, 2022Updated 3 years ago
- GBDF: Gender Balanced DeepFake Dataset☆11Jul 22, 2022Updated 3 years ago
- A library of speech gadgets.☆14Oct 15, 2022Updated 3 years ago
- ☆11Aug 7, 2025Updated 6 months ago
- This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm…☆12Oct 9, 2024Updated last year
- Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…☆14Feb 15, 2023Updated 3 years ago
- Evaluation metrics and submission file creation scripts for the Action Recognition challenge☆15Feb 9, 2026Updated 2 weeks ago
- Open Source Speech Inferencing Library for Indic Languages☆13Apr 11, 2022Updated 3 years ago
- This repository contains the speaker labeled information of VoxCeleb2 and LRS3 audio-visual datasets. (AAAI 2025)☆13Sep 6, 2024Updated last year
- Activity Grammars for Temporal Action Segmentation (NeurIPS 2023)☆14Jun 14, 2024Updated last year
- Time frequency ridge detection based on relevant ridge portions☆11Aug 17, 2023Updated 2 years ago
- ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'☆44Oct 31, 2022Updated 3 years ago
- Simple Python script to compute equal error rate (EER) for machine learning model evaluation.☆42Mar 12, 2020Updated 5 years ago
- ☆11Nov 28, 2025Updated 3 months ago
- ☆11May 7, 2022Updated 3 years ago
- [ICTC'24] - "Voice-Based Age and Gender Recognition: A Comparative Study of LSTM, RezoNet and Hybrid CNNs-BiLSTM Architecture" by Nhut Mi…☆10Jan 16, 2025Updated last year
- Transformer based ASR Engine.☆13Aug 23, 2021Updated 4 years ago
- ☆10Nov 16, 2021Updated 4 years ago
- ☆13Sep 26, 2023Updated 2 years ago
- Spectral Clustering in C++☆17Jan 8, 2013Updated 13 years ago
- Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding"☆10Jul 8, 2020Updated 5 years ago
- Human age estimation using deep neural networks (Keras)☆13Aug 10, 2023Updated 2 years ago
- Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations @ ICCV21☆13Jul 15, 2022Updated 3 years ago
- Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals☆18Aug 8, 2024Updated last year
- An exploration of LLM steering☆24Jun 15, 2024Updated last year