[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
☆61 · Aug 17, 2021 · Updated 4 years ago
Alternatives and similar repositories for VidSitu
Users that are interested in VidSitu are comparing it to the libraries listed below.
- ☆14 · Dec 9, 2023 · Updated 2 years ago
- Condensed Movies Challenge 2021 · ☆20 · Sep 21, 2022 · Updated 3 years ago
- SGAP-Net: Semantic-Guided Attentive Prototypes Network for Few-Shot Human-Object Interaction Recognition, AAAI2020. · ☆14 · Dec 15, 2020 · Updated 5 years ago
- A video database bridging human actions and human-object relationships · ☆163 · Jun 30, 2020 · Updated 5 years ago
- VisualCOMET: Reasoning about the Dynamic Context of a Still Image · ☆88 · Jun 12, 2023 · Updated 2 years ago
- Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering · ☆29 · Jul 1, 2024 · Updated last year
- Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20] · ☆198 · Sep 21, 2022 · Updated 3 years ago
- [EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction · ☆51 · Aug 20, 2022 · Updated 3 years ago
- ☆87 · Mar 4, 2024 · Updated 2 years ago
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024) · ☆16 · Apr 23, 2024 · Updated 2 years ago
- Codes for ECCV paper: "Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation" · ☆16 · Jul 20, 2020 · Updated 5 years ago
- PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops) · ☆144 · Apr 8, 2023 · Updated 3 years ago
- [ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation · ☆60 · Aug 27, 2022 · Updated 3 years ago
- MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions · ☆175 · Oct 22, 2023 · Updated 2 years ago
- Codebase for VidHal: Benchmarking Hallucinations in Vision LLMs · ☆14 · Apr 23, 2026 · Updated 2 weeks ago
- Referring expression comprehension on ReferIt(RefClef) · ☆10 · Nov 28, 2016 · Updated 9 years ago
- MERLOT: Multimodal Neural Script Knowledge Models · ☆226 · Mar 15, 2022 · Updated 4 years ago
- [CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning… · ☆730 · Aug 8, 2023 · Updated 2 years ago
- ☆17 · Nov 14, 2022 · Updated 3 years ago
- This is a repository for paper titled, PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Plann… · ☆14 · Nov 3, 2023 · Updated 2 years ago
- Code for the HowTo100M paper · ☆300 · Mar 10, 2020 · Updated 6 years ago
- [CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606) · ☆69 · Jun 10, 2020 · Updated 5 years ago
- Implementation for the CVPR2019 paper "Graphical Contrastive Losses for Scene Graph Generation" · ☆200 · Apr 2, 2020 · Updated 6 years ago
- Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference" · ☆161 · Apr 29, 2020 · Updated 6 years ago
- Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021 · ☆66 · Oct 21, 2021 · Updated 4 years ago
- Situation With Groundings (SWiG) dataset and Joint Situation Localizer (JSL) · ☆71 · Mar 19, 2021 · Updated 5 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners · ☆117 · Sep 15, 2022 · Updated 3 years ago
- Tools for movie and video research · ☆308 · Jun 20, 2022 · Updated 3 years ago
- CVPR2022: Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency · ☆18 · Aug 10, 2022 · Updated 3 years ago
- [NeurIPS 2022] Egocentric Video-Language Pretraining · ☆260 · May 9, 2024 · Updated 2 years ago
- [TPAMI 2024] This is the official Pytorch code for our paper "Context Disentangling and Prototype Inheriting for Robust Visual Grounding"… · ☆28 · May 8, 2025 · Updated last year
- [CVPR 2021] Multi-shot Temporal Event Localization: a Benchmark · ☆55 · Mar 19, 2022 · Updated 4 years ago
- ☆13 · Feb 14, 2022 · Updated 4 years ago
- ☆34 · Jun 2, 2023 · Updated 2 years ago
- Implementation of "Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning" (https://arxiv.… · ☆26 · Nov 3, 2018 · Updated 7 years ago
- Evaluation script for VoxMovies dataset in PyTorch · ☆23 · Jan 12, 2024 · Updated 2 years ago
- Learning Situation Hyper-Graphs for Video Question Answering · ☆23 · Feb 16, 2024 · Updated 2 years ago
- Code release for Learning to Assemble Neural Module Tree Networks for Visual Grounding (ICCV 2019) · ☆39 · Nov 23, 2019 · Updated 6 years ago
- awesome grounding: A curated list of research papers in visual grounding · ☆1,125 · Sep 21, 2025 · Updated 7 months ago