Code for the "Look for the Change" paper published at CVPR 2022
☆36 · Oct 26, 2022 · Updated 3 years ago
Alternatives and similar repositories for LookForTheChange
Users that are interested in LookForTheChange are comparing it to the libraries listed below
- ChangeIt dataset with more than 2600 hours of video with state-changing actions published at CVPR 2022 ☆11 · Mar 23, 2022 · Updated 3 years ago
- Code for the paper "Multi-Task Learning of Object States and State-Modifying Actions from Web Videos" published in TPAMI ☆11 · Mar 3, 2024 · Updated last year
- Unifying Specialized Visual Encoders for Video Language Models ☆25 · Nov 22, 2025 · Updated 3 months ago
- Code for paper, "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" ECCV 2022 ☆39 · Feb 17, 2023 · Updated 3 years ago
- [NeurIPS 2021] Introspective Distillation for Robust Question Answering ☆13 · Dec 7, 2021 · Updated 4 years ago
- ☆17 · Sep 2, 2023 · Updated 2 years ago
- Code for the paper "GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos" published at CVPR 2024 ☆53 · Mar 3, 2024 · Updated last year
- Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" NeurIPS 23 Spotlight ☆37 · May 23, 2023 · Updated 2 years ago
- ☆31 · Dec 17, 2025 · Updated 2 months ago
- Official implementation of the paper How to Listen? Rethinking Visual Sound Localization ☆17 · Apr 25, 2022 · Updated 3 years ago
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation) ☆16 · Jan 18, 2024 · Updated 2 years ago
- ☆13 · Feb 14, 2022 · Updated 4 years ago
- PyTorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners ☆116 · Sep 15, 2022 · Updated 3 years ago
- A pip installable package for optimal transport inspired loss functions in the spectral domain. Can be used for audio applications such a… ☆29 · Dec 5, 2025 · Updated 2 months ago
- Living in Rochester with joy (especially for University of Rochester) ☆26 · Aug 5, 2019 · Updated 6 years ago
- Shows visual grounding methods can be right for the wrong reasons! (ACL 2020) ☆23 · Jun 26, 2020 · Updated 5 years ago
- [ICLR 2024 Poster] SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos ☆20 · Aug 21, 2025 · Updated 6 months ago
- Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding" ☆50 · Jan 27, 2025 · Updated last year
- [CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations" ☆56 · Aug 8, 2023 · Updated 2 years ago
- Official Repository for CVPR 2022 paper "REX: Reasoning-aware and Grounded Explanation" ☆22 · Nov 21, 2023 · Updated 2 years ago
- ☆26 · May 19, 2022 · Updated 3 years ago
- BottomUpTopDown VQA model with question-type debiasing ☆22 · Oct 6, 2019 · Updated 6 years ago
- Codebase for "Revisiting spatio-temporal layouts for compositional action recognition" (Oral at BMVC 2021). ☆27 · Apr 3, 2022 · Updated 3 years ago
- ☆27 · Oct 7, 2021 · Updated 4 years ago
- Methods of training NLP models to ignore biased strategies ☆55 · May 22, 2023 · Updated 2 years ago
- Source code for "Rethinking training of 3D GANs" ☆31 · May 26, 2022 · Updated 3 years ago
- GQA-OOD is a new dataset and benchmark for the evaluation of VQA models in OOD (out of distribution) settings. ☆32 · Mar 1, 2021 · Updated 5 years ago
- Official repository for TikTok-DeepFake (TT-DF) ☆13 · Feb 17, 2025 · Updated last year
- Codebase for AAAI 2024 conference paper Visual Chain-of-Thought Prompting for Knowledge-based Visual Reasoning ☆39 · Mar 12, 2025 · Updated 11 months ago
- Accepted by CVPR 2020. ☆27 · Jul 11, 2024 · Updated last year
- [CVPR 2025] Official implementation of PhysFlow: Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dyna… ☆36 · Jul 28, 2025 · Updated 7 months ago
- Charades Object Detection Dataset (ICCV 2017) ☆31 · May 30, 2018 · Updated 7 years ago
- Code for CVPR'18 "Grounding Referring Expressions in Images by Variational Context" ☆30 · Jul 4, 2018 · Updated 7 years ago
- Code release for "Language-conditioned Detection Transformer" ☆88 · Jun 17, 2024 · Updated last year
- EPIC-Kitchens-100 Action Recognition baselines: TSN, TRN, TSM ☆33 · Mar 15, 2022 · Updated 3 years ago
- End to End implementation of Topic Modeling using LDA ☆11 · Jun 26, 2021 · Updated 4 years ago
- Code for LaMPP: Language Models as Probabilistic Priors for Perception and Action ☆37 · Apr 3, 2023 · Updated 2 years ago
- [NeurIPS 2025] Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models ☆64 · Nov 27, 2025 · Updated 3 months ago
- O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning (CoRL 2021) ☆37 · Feb 22, 2022 · Updated 4 years ago