edsonroteia / cav-mae-sync
[CVPR25] Official Implementation of CAV-MAE Sync
☆30Jun 18, 2025Updated 7 months ago
Alternatives and similar repositories for cav-mae-sync
Users that are interested in cav-mae-sync are comparing it to the libraries listed below
- ☆12Mar 24, 2024Updated last year
- Visual Speech Recognition For Low-Resource Languages with Automatic Labels (ICASSP 2024)☆16Mar 17, 2025Updated 10 months ago
- ☆12Mar 12, 2023Updated 2 years ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)☆20Mar 17, 2025Updated 10 months ago
- Code for the C2KD paper (ICASSP 2023)☆18May 15, 2023Updated 2 years ago
- This is an official PyTorch implementation of Learning To Recognize Procedural Activities with Distant Supervision. In this repository, w…☆43Feb 21, 2023Updated 2 years ago
- Temperature Schedules for self-supervised contrastive methods on long-tail data (ICLR'23)☆18Apr 25, 2023Updated 2 years ago
- ☆23Dec 5, 2023Updated 2 years ago
- Solos: A Dataset for Audio-Visual Music Analysis☆24Feb 17, 2023Updated 2 years ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.☆53Mar 30, 2022Updated 3 years ago
- [CHIL 2024] Interpretation of Intracardiac Electrograms Through Textual Representations☆12Sep 4, 2024Updated last year
- Official repository for the MMFM challenge☆25Jun 18, 2024Updated last year
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models☆27Nov 29, 2023Updated 2 years ago
- Data Release for VALUE Benchmark☆30Feb 16, 2022Updated 3 years ago
- ICCV 2021☆34May 11, 2022Updated 3 years ago
- Official Implementation of "Interpretable 3D Neural Object Volumes for Robust Conceptual Reasoning." ICLR 2026.☆30Feb 3, 2026Updated last week
- Code for Learning to Learn Language from Narrated Video☆33Oct 3, 2023Updated 2 years ago
- ☆36Jul 9, 2025Updated 7 months ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"☆19Nov 3, 2025Updated 3 months ago
- ☆17Apr 25, 2023Updated 2 years ago
- Ace-Step Dataset Generator☆23Sep 27, 2025Updated 4 months ago
- Code for "RADSeg: Unleashing Parameter and Compute Efficient Zero-Shot Open-Vocabulary Segmentation Using Agglomerative Models"☆28Jan 27, 2026Updated 2 weeks ago
- Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)☆89Jul 25, 2024Updated last year
- A lightweight AI agent which reads and summarizes research papers for you.☆12Jun 1, 2025Updated 8 months ago
- Dataset from Tip of the Tongue Known-Item Retrieval (2021) paper.☆12Nov 4, 2021Updated 4 years ago
- An MCP server implementation providing a standardized interface for LLMs to interact with the Atla API.☆17Jul 21, 2025Updated 6 months ago
- DO with Terraform and Ansible☆11Jun 5, 2018Updated 7 years ago
- ☆13Oct 25, 2024Updated last year
- Research code for "Towards multi-task learning of speech and speaker recognition" at https://arxiv.org/pdf/2302.12773.pdf☆12Dec 2, 2024Updated last year
- A framework for building speech-enabled websites.☆10Jul 10, 2015Updated 10 years ago
- ☆10Feb 19, 2021Updated 4 years ago
- ☆15Feb 4, 2021Updated 5 years ago
- A set of markdown file templates for Claude Code to make getting started MUCH easier!☆31Feb 6, 2026Updated last week
- awesome-audio-visual-robustness☆11Jan 27, 2024Updated 2 years ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…☆13May 25, 2023Updated 2 years ago
- An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.☆10Feb 22, 2022Updated 3 years ago
- A reconstruction framework for materializing subjective experiences from brain signals☆13Jan 18, 2025Updated last year
- Compatible with all CUDA cards. Windows and Linux☆22Aug 18, 2025Updated 5 months ago
- ☆11Nov 5, 2025Updated 3 months ago