[CVPR25] Official Implementation of CAV-MAE Sync
☆30 · Jun 18, 2025 · Updated 8 months ago
Alternatives and similar repositories for cav-mae-sync
Users who are interested in cav-mae-sync are comparing it to the libraries listed below.
- Visual Speech Recognition For Low-Resource Languages with Automatic Labels (ICASSP 2024) ☆16 · Mar 17, 2025 · Updated 11 months ago
- ☆12 · Mar 24, 2024 · Updated last year
- WildVSR ☆21 · Dec 13, 2023 · Updated 2 years ago
- Code for the C2KD paper (ICASSP 2023) ☆19 · May 15, 2023 · Updated 2 years ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024) ☆20 · Mar 17, 2025 · Updated 11 months ago
- This is an official pytorch implementation of Learning To Recognize Procedural Activities with Distant Supervision. In this repository, w… ☆43 · Feb 21, 2023 · Updated 3 years ago
- Temperature Schedules for self-supervised contrastive methods on long-tail data (ICLR'23) ☆18 · Apr 25, 2023 · Updated 2 years ago
- ☆23 · Dec 5, 2023 · Updated 2 years ago
- Solos: A Dataset for Audio-Visual Music Analysis ☆24 · Feb 17, 2023 · Updated 3 years ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers. ☆54 · Mar 30, 2022 · Updated 3 years ago
- [CHIL 2024] Interpretation of Intracardiac Electrograms Through Textual Representations ☆12 · Sep 4, 2024 · Updated last year
- Data Release for VALUE Benchmark ☆30 · Feb 16, 2022 · Updated 4 years ago
- ICCV 2021 ☆34 · May 11, 2022 · Updated 3 years ago
- Official code for the CVPR 2024 Paper: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language ☆86 · Jun 12, 2024 · Updated last year
- Code for Learning to Learn Language from Narrated Video ☆33 · Oct 3, 2023 · Updated 2 years ago
- ☆36 · Jul 9, 2025 · Updated 7 months ago
- ☆23 · Jun 19, 2025 · Updated 8 months ago
- Develop C++/CUDA extensions with PyTorch like Python scripts ☆10 · Updated this week
- Ace-Step Dataset Generator ☆23 · Sep 27, 2025 · Updated 5 months ago
- Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight) ☆90 · Jul 25, 2024 · Updated last year
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling" ☆19 · Nov 3, 2025 · Updated 4 months ago
- A lightweight AI agent which reads and summarizes research papers for you. ☆12 · Jun 1, 2025 · Updated 9 months ago
- awesome-audio-visual-robustness ☆11 · Jan 27, 2024 · Updated 2 years ago
- The DistanceMetrics package is a comprehensive Python library designed to compute a wide variety of distance metrics between two vectors,… ☆15 · Sep 25, 2025 · Updated 5 months ago
- WavSpA: Wavelet Space Attention for Enhancing Transformer's Long Sequence Learning ☆12 · Feb 24, 2024 · Updated 2 years ago
- ISS Tracker for the Cardputer Adv ☆36 · Jan 19, 2026 · Updated last month
- This repository contains the official implementation of the paper "LandSegmenter: Towards a Flexible Foundation Model for Land Use and La… ☆26 · Dec 8, 2025 · Updated 2 months ago
- A framework for building speech-enabled websites. ☆10 · Jul 10, 2015 · Updated 10 years ago
- Research code for "Towards multi-task learning of speech and speaker recognition" at https://arxiv.org/pdf/2302.12773.pdf ☆12 · Dec 2, 2024 · Updated last year
- ☆13 · Oct 25, 2024 · Updated last year
- ☆10 · Feb 19, 2021 · Updated 5 years ago
- A CUDA powered audio decoding framework for FLAC. ☆11 · May 22, 2018 · Updated 7 years ago
- ☆18 · Dec 3, 2021 · Updated 4 years ago
- ☆15 · Feb 4, 2021 · Updated 5 years ago
- A set of markdown file templates for Claude Code to make getting started MUCH easier! ☆34 · Feb 6, 2026 · Updated last month
- DO with Terraform and Ansible ☆11 · Jun 5, 2018 · Updated 7 years ago
- Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023 ☆46 · Sep 1, 2024 · Updated last year
- Companion toolkit of the 'Serial Speakers' dataset. ☆11 · Feb 17, 2020 · Updated 6 years ago
- ☆12 · Feb 27, 2024 · Updated 2 years ago