KaihuaTang / VCTree-Visual-Question-Answering
Code for the Visual Question Answering (VQA) part of CVPR 2019 oral paper: "Learning to Compose Dynamic Tree Structures for Visual Contexts"
☆34 · Mar 18, 2019 · Updated 6 years ago
Alternatives and similar repositories for VCTree-Visual-Question-Answering
Users who are interested in VCTree-Visual-Question-Answering are comparing it to the repositories listed below
- Code for the Scene Graph Generation part of CVPR 2019 oral paper: "Learning to Compose Dynamic Tree Structures for Visual Contexts" ☆123 · Jan 6, 2026 · Updated last month
- Support Large Vocabulary Instance Segmentation (LVIS) dataset for mmdetection ☆16 · Apr 24, 2020 · Updated 5 years ago
- Code release for Hu et al., "Language-Conditioned Graph Networks for Relational Reasoning", in ICCV 2019 ☆92 · Aug 9, 2019 · Updated 6 years ago
- The official PyTorch Implementation of the Paper "Adversarial Visual Robustness by Causal Intervention" ☆18 · Oct 6, 2021 · Updated 4 years ago
- PyTorch implementation of "Explainable and Explicit Visual Reasoning over Scene Graphs" ☆93 · Mar 17, 2019 · Updated 6 years ago
- [NeurIPS 2021] Introspective Distillation for Robust Question Answering ☆13 · Dec 7, 2021 · Updated 4 years ago
- Repository to generate CLEVR-Dialog: A diagnostic dataset for Visual Dialog ☆49 · Feb 18, 2020 · Updated 5 years ago
- Egocentric Video Description based on Temporally-Linked Sequences ☆11 · Jul 17, 2017 · Updated 8 years ago
- Code for "Counterfactual Variable Control for Robust and Interpretable Question Answering" ☆14 · Oct 13, 2020 · Updated 5 years ago
- Re-implementation for "R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering" ☆12 · Apr 11, 2019 · Updated 6 years ago
- [WACV2023] Intention-Conditioned Long-Term Human Egocentric Action Forecasting @ EGO4D Challenge 2022 ☆14 · Sep 3, 2023 · Updated 2 years ago
- Official Python implementation of R3-Transformer ☆15 · Nov 30, 2020 · Updated 5 years ago
- MUREL (CVPR 2019), a multimodal relational reasoning module for VQA ☆195 · Feb 9, 2020 · Updated 6 years ago
- Code for our paper: Learning Conditioned Graph Structures for Interpretable Visual Question Answering ☆150 · Mar 11, 2019 · Updated 6 years ago
- Implementation for the CVPR2019 paper "Graphical Contrastive Losses for Scene Graph Generation" ☆201 · Apr 2, 2020 · Updated 5 years ago
- External Twitter feeder for AIL framework ☆16 · Apr 16, 2023 · Updated 2 years ago
- Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos ☆71 · Sep 7, 2021 · Updated 4 years ago
- Implementation of Soft-Label Chain Conditional Random Field for Phrase Grounding in PyTorch ☆16 · Oct 21, 2022 · Updated 3 years ago
- [AAAI 2022 Oral] Static-Dynamic Co-Teaching for Class-Incremental 3D Object Detection ☆25 · Nov 22, 2022 · Updated 3 years ago
- Disentangled Pre-training for Human-Object Interaction Detection ☆27 · Sep 17, 2025 · Updated 4 months ago
- Code and data for the project "Visually grounded continual learning of compositional semantics" ☆22 · Dec 27, 2022 · Updated 3 years ago
- Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering ☆28 · Jul 1, 2024 · Updated last year
- A lightweight, scalable, and general framework for visual question answering research ☆330 · Sep 3, 2021 · Updated 4 years ago
- Code for NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering" ☆41 · Sep 9, 2019 · Updated 6 years ago
- Code for the model "Heterogeneous Graph Learning for Visual Commonsense Reasoning (NeurIPS 2019)" ☆47 · Jul 27, 2020 · Updated 5 years ago
- A simple but well-performing "single-hop" visual attention model for the GQA dataset ☆20 · Aug 8, 2019 · Updated 6 years ago
- A PyTorch implementation for "A simple neural network module for relational reasoning", working on the CLEVR dataset ☆90 · Nov 19, 2019 · Updated 6 years ago
- Code release for Energy-Based Learning for Scene Graph Generation ☆94 · Apr 5, 2022 · Updated 3 years ago
- Deep Modular Co-Attention Networks for Visual Question Answering ☆458 · Dec 16, 2020 · Updated 5 years ago
- The source code of Multi-modal Circulant Fusion (MCF) for Temporal Activity Localization ☆23 · Mar 10, 2019 · Updated 6 years ago
- Video captioning ☆24 · Mar 14, 2019 · Updated 6 years ago
- BottomUpTopDown VQA model with question-type debiasing ☆22 · Oct 6, 2019 · Updated 6 years ago
- Rethinking the Form of Latent States in Image Captioning ☆20 · Aug 31, 2018 · Updated 7 years ago