Visual Relation Grounding in Videos (ECCV'20, Spotlight)
☆57 · Dec 8, 2022 · Updated 3 years ago
Alternatives and similar repositories for vRGV
Users who are interested in vRGV are comparing it to the repositories listed below.
- To keep up to date with the VRU Grand Challenge, please use https://github.com/NExTplusplus/VidVRD-helper ☆102 · Jan 24, 2022 · Updated 4 years ago
- This repository contains the main baselines introduced in WSSTG (ACL 2019). ☆56 · Jul 8, 2024 · Updated last year
- This repository provides the dataset introduced by our WSSTG paper ☆13 · Jul 21, 2019 · Updated 6 years ago
- [CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606) ☆69 · Jun 10, 2020 · Updated 5 years ago
- Code and data for the project "Visually grounded continual learning of compositional semantics" ☆22 · Dec 27, 2022 · Updated 3 years ago
- Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral) ☆135 · Jul 25, 2024 · Updated last year
- Video Visual Relation Detection (VidVRD) tracklet generation. Also for the ACM MM Visual Relation Understanding Grand Challenge ☆40 · Dec 5, 2022 · Updated 3 years ago
- [ESWA 2025] Official PyTorch implementation of "What and When to look?: Temporal Span Proposal Network for Video Relation Detection" ☆16 · Aug 9, 2021 · Updated 4 years ago
- Implementation of paper "Not All Frames Are Equal: Weakly-Supervised Video Grounding with Contextual Similarity and Visual Clustering Los… ☆30 · Jun 29, 2020 · Updated 5 years ago
- This repository provides the dataset introduced by the paper "Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentenc… ☆69 · May 1, 2020 · Updated 5 years ago
- [CVPR'19] [PyTorch] Gated Spatio Temporal Energy Graph ☆153 · Feb 20, 2020 · Updated 6 years ago
- Official TensorFlow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware P… ☆59 · Mar 24, 2023 · Updated 2 years ago
- This is the repo for Multi-level textual grounding ☆34 · Jul 21, 2020 · Updated 5 years ago
- ☆15 · Aug 12, 2022 · Updated 3 years ago
- Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020 ☆43 · Apr 26, 2020 · Updated 5 years ago
- Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral) ☆34 · Sep 17, 2022 · Updated 3 years ago
- Video Graph Transformer for Video Question Answering (ECCV'22) ☆49 · Jun 8, 2023 · Updated 2 years ago
- Pre-trained V+L Data Preparation ☆46 · Jun 2, 2020 · Updated 5 years ago
- Code for our ACM MM 2019 paper: "Exploiting Temporal Relationships in Video Moment Localization with Natural Language" ☆16 · Oct 22, 2022 · Updated 3 years ago
- Code for the paper "Controllable Video Captioning with an Exemplar Sentence" ☆12 · Apr 14, 2021 · Updated 4 years ago
- This is the code of ECCV 2022 (Oral) paper "Fine-Grained Scene Graph Generation with Data Transfer". ☆103 · Jan 24, 2023 · Updated 3 years ago
- PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision ☆46 · Jul 29, 2020 · Updated 5 years ago
- Scene Graph Parsing as Dependency Parsing ☆41 · May 22, 2019 · Updated 6 years ago
- CVPR 2022 (Oral) PyTorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment ☆22 · Apr 15, 2022 · Updated 3 years ago
- A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral) ☆149 · Nov 18, 2020 · Updated 5 years ago
- This is an implementation of "Grounding of Textual Phrases in Images by Reconstruction" in PyTorch ☆18 · Apr 7, 2020 · Updated 5 years ago
- Referring expression comprehension on ReferIt(RefClef) ☆10 · Nov 28, 2016 · Updated 9 years ago
- Code release for Hu et al., Language-Conditioned Graph Networks for Relational Reasoning. in ICCV, 2019 ☆92 · Aug 9, 2019 · Updated 6 years ago
- ☆12 · Mar 12, 2023 · Updated 3 years ago
- Scene Graph Prediction with Limited Labels ☆54 · Oct 3, 2023 · Updated 2 years ago
- Code for "Counterfactual Variable Control for Robust and Interpretable Question Answering" ☆14 · Oct 13, 2020 · Updated 5 years ago
- Learning phrase grounding from captioned images through InfoNCE bound on mutual information ☆74 · Aug 22, 2020 · Updated 5 years ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ☆185 · Aug 2, 2025 · Updated 7 months ago
- A Dataset for Grounded Video Description ☆164 · Jan 4, 2022 · Updated 4 years ago
- Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos ☆71 · Sep 7, 2021 · Updated 4 years ago
- Source code of "Training Free Graph Neural Networks for Graph Matching" ☆12 · Jul 9, 2022 · Updated 3 years ago
- Multi-faceted Video Moment Localizer ☆17 · Jun 19, 2020 · Updated 5 years ago
- Implementation for the AAAI2019 paper "Large-scale Visual Relationship Understanding" ☆146 · Sep 3, 2019 · Updated 6 years ago
- ☆26 · Oct 8, 2021 · Updated 4 years ago