sutdcv / SUTD-TrafficQA
[CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
☆53 Updated 5 months ago
Alternatives and similar repositories for SUTD-TrafficQA:
Users who are interested in SUTD-TrafficQA are comparing it to the repositories listed below:
- PyTorch implementation of our paper Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs, which i… ☆45 Updated last year
- [ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer ☆35 Updated last year
- [CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos ☆94 Updated 2 years ago
- [ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation ☆59 Updated 2 years ago
- Video Visual Relation Detection (VidVRD) tracklet generation. Also for the ACM MM Visual Relation Understanding Grand Challenge ☆39 Updated 2 years ago
- ☆34 Updated 3 years ago
- Span-based Localizing Network for Natural Language Video Localization (ACL 2020) ☆104 Updated 3 years ago
- Video Graph Transformer for Video Question Answering (ECCV'22) ☆46 Updated last year
- ☆22 Updated 3 years ago
- ☆34 Updated last year
- ☆192 Updated 2 years ago
- ☆31 Updated 2 years ago
- This is the code of ECCV 2022 (Oral) paper "Fine-Grained Scene Graph Generation with Data Transfer". ☆98 Updated 2 years ago
- [CVPR 2022] A large-scale public benchmark dataset for video question-answering, especially about evidence and commonsense reasoning. The… ☆52 Updated 6 months ago
- This repo contains code for Invariant Grounding for Video Question Answering ☆26 Updated last year
- praneeth11009 / LIGHTEN-Learning-Interactions-with-Graphs-and-Hierarchical-TEmporal-Networks-for-HOI ☆16 Updated 4 years ago
- This repository provides the dataset introduced by the paper "Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentenc… ☆58 Updated 4 years ago
- Code for the paper "Zero-shot Natural Language Video Localization" (ICCV2021, Oral). ☆47 Updated last year
- Repository for the CVPR-20 paper "Local-Global Video-Text Interactions for Temporal Grounding" ☆130 Updated 3 years ago
- A reading list of papers about Visual Grounding. ☆31 Updated 2 years ago
- Code for "Mining the Benefits of Two-stage and One-stage HOI Detection" ☆88 Updated 10 months ago
- [NeurIPS 2022] Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding ☆45 Updated 10 months ago
- [CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers ☆175 Updated last year
- [ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision" ☆100 Updated last year
- [ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos ☆118 Updated last year
- Official code for the ICLR2023 paper Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection ☆43 Updated 7 months ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ☆27 Updated last year
- "Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022 ☆68 Updated 2 years ago
- [ICCV2021] Generic Event Boundary Detection: A Benchmark for Event Segmentation ☆68 Updated 3 years ago
- [AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding ☆90 Updated 2 years ago