[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
☆61Aug 17, 2021Updated 4 years ago
Alternatives and similar repositories for VidSitu
Users that are interested in VidSitu are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Condensed Movies Challenge 2021☆20Sep 21, 2022Updated 3 years ago
- SGAP-Net: Semantic-Guided Attentive Prototypes Network for Few-Shot Human-Object Interaction Recognition, AAAI2020.☆14Dec 15, 2020Updated 5 years ago
- A video database bridging human actions and human-object relationships☆159Jun 30, 2020Updated 5 years ago
- VisualCOMET: Reasoning about the Dynamic Context of a Still Image☆88Jun 12, 2023Updated 2 years ago
- Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering☆29Jul 1, 2024Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]☆196Sep 21, 2022Updated 3 years ago
- [EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction☆51Aug 20, 2022Updated 3 years ago
- ☆87Mar 4, 2024Updated 2 years ago
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)☆16Apr 23, 2024Updated last year
- Official implementation of the ACL 2023 paper: "Zero-shot Faithful Factual Error Correction"☆17Aug 14, 2023Updated 2 years ago
- Codes for ECCV paper: "Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation"☆16Jul 20, 2020Updated 5 years ago
- PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)☆144Apr 8, 2023Updated 3 years ago
- [ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation☆60Aug 27, 2022Updated 3 years ago
- MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions☆174Oct 22, 2023Updated 2 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Codebase for VidHal: Benchmarking Hallucinations in Vision LLMs☆14Apr 19, 2025Updated 11 months ago
- Referring expression comprehension on ReferIt(RefClef)☆10Nov 28, 2016Updated 9 years ago
- MERLOT: Multimodal Neural Script Knowledge Models☆226Mar 15, 2022Updated 4 years ago
- [CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…☆730Aug 8, 2023Updated 2 years ago
- ☆17Nov 14, 2022Updated 3 years ago
- This is a repository for paper titled, PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Plann…☆14Nov 3, 2023Updated 2 years ago
- Code for the HowTo100M paper☆299Mar 10, 2020Updated 6 years ago
- [CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)☆69Jun 10, 2020Updated 5 years ago
- Implementation for the CVPR2019 paper "Graphical Contrastive Losses for Scene Graph Generation"☆200Apr 2, 2020Updated 6 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"☆161Apr 29, 2020Updated 5 years ago
- Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021☆66Oct 21, 2021Updated 4 years ago
- Situation With Groundings (SWiG) dataset and Joint Situation Localizer (JSL)☆71Mar 19, 2021Updated 5 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆117Sep 15, 2022Updated 3 years ago
- Tools for movie and video research☆307Jun 20, 2022Updated 3 years ago
- CVPR2022:Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency☆18Aug 10, 2022Updated 3 years ago
- [NeurIPS 2022] Egocentric Video-Language Pretraining☆260May 9, 2024Updated last year
- [TPAMI 2024] This is the official Pytorch code for our paper "Context Disentangling and Prototype Inheriting for Robust Visual Grounding"…☆28May 8, 2025Updated 11 months ago
- [CVPR 2021] Multi-shot Temporal Event Localization: a Benchmark☆55Mar 19, 2022Updated 4 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- ☆13Feb 14, 2022Updated 4 years ago
- ☆34Jun 2, 2023Updated 2 years ago
- Implementation of "Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning" (https://arxiv.…☆26Nov 3, 2018Updated 7 years ago
- Evaluation script for VoxMovies dataset in PyTorch☆23Jan 12, 2024Updated 2 years ago
- Learning Situation Hyper-Graphs for Video Question Answering☆23Feb 16, 2024Updated 2 years ago
- Code release for Learning to Assemble Neural Module Tree Networks for Visual Grounding (ICCV 2019)☆39Nov 23, 2019Updated 6 years ago
- awesome grounding: A curated list of research papers in visual grounding☆1,126Sep 21, 2025Updated 6 months ago