Visual Relation Grounding in Videos (ECCV'20, Spotlight)
☆57Dec 8, 2022Updated 3 years ago
Alternatives and similar repositories for vRGV
Users that are interested in vRGV are comparing it to the libraries listed below
Sorting:
- To keep updates with VRU Grand Challenge, please use https://github.com/NExTplusplus/VidVRD-helper☆102Jan 24, 2022Updated 4 years ago
- This repository contains the main baselines introduced in WSSTG (ACL 2019).☆56Jul 8, 2024Updated last year
- Code and data for the project "Visually grounded continual learning of compositional semantics"☆22Dec 27, 2022Updated 3 years ago
- This repository provides the dataset introduced by our WSSTG paper☆13Jul 21, 2019Updated 6 years ago
- Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)☆134Jul 25, 2024Updated last year
- Implementation of paper "Not All Frames Are Equal: Weakly-Supervised Video Grounding with Contextual Similarity and Visual Clustering Los…☆30Jun 29, 2020Updated 5 years ago
- [CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)☆69Jun 10, 2020Updated 5 years ago
- This repository provides the dataset introduced by the paper "Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentenc…☆69May 1, 2020Updated 5 years ago
- Official Tensorflow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware P…☆59Mar 24, 2023Updated 2 years ago
- [CVPR'19] [PyTorch] Gated Spatio Temporal Energy Graph☆153Feb 20, 2020Updated 6 years ago
- Video Visual Relation Detection (VidVRD) tracklets generation. also for ACM MM Visual Relation Understanding Grand Challenge☆39Dec 5, 2022Updated 3 years ago
- [ESWA 2025] Official pytorch implementation of "What and When to look?: Temporal Span Proposal Network for Video Relation Detection"☆16Aug 9, 2021Updated 4 years ago
- Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020☆43Apr 26, 2020Updated 5 years ago
- CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment☆22Apr 15, 2022Updated 3 years ago
- Pre-trained V+L Data Preparation☆46Jun 2, 2020Updated 5 years ago
- Code for "Counterfactual Variable Control for Robust and Interpretable Question Answering"☆14Oct 13, 2020Updated 5 years ago
- PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision☆46Jul 29, 2020Updated 5 years ago
- A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral)☆148Nov 18, 2020Updated 5 years ago
- This is the repo for Multi-level textual grounding☆34Jul 21, 2020Updated 5 years ago
- Codes for our ACM MM 2019 paper: "Exploiting Temporal Relationships in Video Moment Localization with Natural Language"☆16Oct 22, 2022Updated 3 years ago
- Code for the paper "Controllable Video Captioning with an Exemplar Sentence"☆12Apr 14, 2021Updated 4 years ago
- Scene Graph Parsing as Dependency Parsing☆41May 22, 2019Updated 6 years ago
- Code release for Hu et al., Language-Conditioned Graph Networks for Relational Reasoning. in ICCV, 2019☆92Aug 9, 2019Updated 6 years ago
- Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos☆71Sep 7, 2021Updated 4 years ago
- Learning phrase grounding from captioned images through InfoNCE bound on mutual information☆74Aug 22, 2020Updated 5 years ago
- Referring expression comprehension on ReferIt(RefClef)☆10Nov 28, 2016Updated 9 years ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions"☆43Jul 15, 2022Updated 3 years ago
- PyTorch implementation of "TALL: Temporal Activity Localization via Language Query. Gao et al. ICCV2017."☆14Apr 20, 2019Updated 6 years ago
- Referring Expression Object Segmentation with Caption-Aware Consistency, BMVC 2019☆31Apr 21, 2021Updated 4 years ago
- [NeurIPS 2022] Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding☆53Mar 5, 2024Updated last year
- This is an implementation of "Grounding of Textual Phrases in Images by Reconstruction" in PyTorch☆17Apr 7, 2020Updated 5 years ago
- Graph-Structured Referring Expressions Reasoning in The Wild, In CVPR 2020, Oral.☆116Aug 10, 2020Updated 5 years ago
- [ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering☆133Oct 25, 2022Updated 3 years ago
- RareAct: A video dataset of unusual interactions☆33Aug 4, 2020Updated 5 years ago
- Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)☆34Sep 17, 2022Updated 3 years ago
- PyTorch bottom-up attention with Detectron2☆239Jan 4, 2022Updated 4 years ago
- Multi-faceted Video Moment Localizer☆17Jun 19, 2020Updated 5 years ago
- BISON: Binary Image SelectiON☆49Sep 15, 2021Updated 4 years ago
- A Dataset for Grounded Video Description☆163Jan 4, 2022Updated 4 years ago