TheShadow29/VidSitu

TheShadow29 / VidSitu

[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)

☆61

Alternatives and similar repositories for VidSitu

Users that are interested in VidSitu are comparing it to the libraries listed below.

zeeshank95 / GVSR
☆14 · Dec 9, 2023 · Updated 2 years ago
m-bain / CondensedMovies-chall
Condensed Movies Challenge 2021
☆22 · Sep 21, 2022 · Updated 3 years ago
Liuxiyao / SGAP-Net
SGAP-Net: Semantic-Guided Attentive Prototypes Network for Few-Shot Human-Object Interaction Recognition, AAAI2020.
☆14 · Dec 15, 2020 · Updated 5 years ago
JingweiJ / ActionGenome
A video database bridging human actions and human-object relationships
☆165 · Jun 30, 2020 · Updated 6 years ago
cdancette / detect-shortcuts
Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
☆29 · Jul 1, 2024 · Updated 2 years ago
jamespark3922 / visual-comet
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
☆87 · Jun 12, 2023 · Updated 3 years ago
m-bain / CondensedMovies
Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]
☆205 · Sep 21, 2022 · Updated 3 years ago
jayleicn / VideoLanguageFuturePred
[EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction
☆52 · Aug 20, 2022 · Updated 3 years ago
chaoyuaw / lvu
☆87 · Mar 4, 2024 · Updated 2 years ago
haoyiq114 / VALOR
Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)
☆16 · Apr 23, 2024 · Updated 2 years ago
khuangaf / ZeroFEC
Official implementation of the ACL 2023 paper: "Zero-shot Faithful Factual Error Correction"
☆17 · Aug 14, 2023 · Updated 2 years ago
v-iashin / MDVC
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
☆144 · Apr 8, 2023 · Updated 3 years ago
Kenneth-Wong / het-eccv20
Codes for ECCV paper: "Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation"
☆16 · Jul 20, 2020 · Updated 6 years ago
MCG-NJU / TRACE
[ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation
☆60 · Aug 27, 2022 · Updated 3 years ago
Soldelli / MAD
MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
☆177 · Oct 22, 2023 · Updated 2 years ago
Lookuz / VidHal
Codebase for VidHal: Benchmarking Hallucinations in Vision LLMs
☆14 · Apr 23, 2026 · Updated 3 months ago
rowanz / merlot
MERLOT: Multimodal Neural Script Knowledge Models
☆226 · Mar 15, 2022 · Updated 4 years ago
ruotianluo / refexp-comprehension
Referring expression comprehension on ReferIt(RefClef)
☆10 · Nov 28, 2016 · Updated 9 years ago
jayleicn / ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…
☆730 · Aug 8, 2023 · Updated 2 years ago
xinyadu / RGQA
☆17 · Nov 14, 2022 · Updated 3 years ago
antoine77340 / howto100m
Code for the HowTo100M paper
☆304 · Mar 10, 2020 · Updated 6 years ago
NVIDIA / ContrastiveLosses4VRD
Implementation for the CVPR2019 paper "Graphical Contrastive Losses for Scene Graph Generation"
☆199 · Apr 2, 2020 · Updated 6 years ago
TheShadow29 / vognet-pytorch
[CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)
☆69 · Jun 10, 2020 · Updated 6 years ago
movienet / movienet-tools
Tools for movie and video research
☆313 · Jun 20, 2022 · Updated 4 years ago
jimmy646 / violin
Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"
☆161 · Apr 29, 2020 · Updated 6 years ago
allenai / swig
Situation With Groundings (SWiG) dataset and Joint Situation Localizer (JSL)
☆71 · Mar 19, 2021 · Updated 5 years ago
syuqings / video-paragraph
Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021
☆66 · Oct 21, 2021 · Updated 4 years ago
MikeWangWZHL / VidIL
Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆117 · Sep 15, 2022 · Updated 3 years ago
alibaba-mmai-research / HiCo
CVPR2022:Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency
☆18 · Aug 10, 2022 · Updated 3 years ago
tejas-gokhale / vqa_mutant
☆13 · Feb 14, 2022 · Updated 4 years ago
xlliu7 / MUSES
[CVPR 2021] Multi-shot Temporal Event Localization: a Benchmark
☆55 · Mar 19, 2022 · Updated 4 years ago
WayneTomas / TransCP
[TPAMI 2024] This is the official Pytorch code for our paper "Context Disentangling and Prototype Inheriting for Robust Visual Grounding"…
☆28 · May 8, 2025 · Updated last year
JaesungHuh / VoxMovies
Evaluation script for VoxMovies dataset in PyTorch
☆23 · Jan 12, 2024 · Updated 2 years ago
md-mohaiminul / TranS4mer
☆34 · Jun 2, 2023 · Updated 3 years ago
chitwansaharia / HACAModel
Implementation of "Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning" (https://arxiv.…
☆26 · Nov 3, 2018 · Updated 7 years ago
TheShadow29 / awesome-grounding
awesome grounding: A curated list of research papers in visual grounding
☆1,126 · Sep 21, 2025 · Updated 10 months ago
daqingliu / NMTree
Code release for Learning to Assemble Neural Module Tree Networks for Visual Grounding (ICCV 2019)
☆38 · Nov 23, 2019 · Updated 6 years ago
maximek3 / e-ViL
☆41 · Nov 23, 2022 · Updated 3 years ago
aurooj / SHG-VQA
Learning Situation Hyper-Graphs for Video Question Answering
☆23 · Feb 16, 2024 · Updated 2 years ago