aurooj / SHG-VQAView external linksLinks
Learning Situation Hyper-Graphs for Video Question Answering
☆22Feb 16, 2024Updated last year
Alternatives and similar repositories for SHG-VQA
Users that are interested in SHG-VQA are comparing it to the libraries listed below
Sorting:
- [CVPR2022] Unsupervised Pre-training for Temporal Action Localization Tasks (UP-TAL)☆29Mar 9, 2022Updated 3 years ago
- Agentic Keyframe Search for Video Question Answering☆15Apr 7, 2025Updated 10 months ago
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework"☆19Jan 18, 2026Updated 3 weeks ago
- ☆12Dec 15, 2023Updated 2 years ago
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"☆26Oct 20, 2022Updated 3 years ago
- Coarse-to-Fine Reasoning for Visual Question Answering (CVPRW'22)☆48Nov 3, 2022Updated 3 years ago
- This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)☆31Jun 28, 2024Updated last year
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering☆16Oct 31, 2024Updated last year
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding. (CVPR 2025 Highlight)"☆42Apr 27, 2025Updated 9 months ago
- ☆30Dec 16, 2022Updated 3 years ago
- ☆13Aug 14, 2022Updated 3 years ago
- NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation☆19May 27, 2024Updated last year
- Implementation of warehouse_ros using MongoDB☆17Oct 13, 2024Updated last year
- ☆11Mar 4, 2021Updated 4 years ago
- DASFAA 2025: Diffusion-based Hierarchical Negative Sampling for Multimodal Knowledge Graph Completion☆17Feb 17, 2025Updated 11 months ago
- Official code for the ICLR2023 paper Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection☆43Jun 4, 2024Updated last year
- Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)☆77Mar 26, 2025Updated 10 months ago
- [ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer☆37Oct 18, 2023Updated 2 years ago
- ☆22Mar 7, 2025Updated 11 months ago
- VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23)☆44Mar 28, 2024Updated last year
- ☆80Nov 24, 2024Updated last year
- Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models (ICCV 20…☆18Apr 23, 2024Updated last year
- This is an official PyTorch Implementation of Neighbor Relations Matter in Video Scene Detection.☆28Mar 19, 2025Updated 10 months ago
- ☆40Nov 29, 2022Updated 3 years ago
- [ECCV 22] LocVTP: Video-Text Pre-training for Temporal Localization☆39Jul 29, 2022Updated 3 years ago
- [2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval☆42Sep 23, 2021Updated 4 years ago
- Video Graph Transformer for Video Question Answering (ECCV'22)☆49Jun 8, 2023Updated 2 years ago
- [ICCV 2025] Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning.☆49Dec 10, 2025Updated 2 months ago
- [ECCV'22 Poster] Explicit Image Caption Editing☆22Nov 30, 2022Updated 3 years ago
- Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)☆19Mar 9, 2024Updated last year
- ☆24Oct 8, 2023Updated 2 years ago
- CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment☆22Apr 15, 2022Updated 3 years ago
- Implementation of our IJCAI2022 oral paper, ER-SAN: Enhanced-Adaptive Relation Self-Attention Network for Image Captioning.☆24Aug 5, 2023Updated 2 years ago
- Official Implementation of ISR-DPO:Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25)☆23Nov 25, 2025Updated 2 months ago
- Official Pytorch Implementation of the framework TEMPURA proposed in our paper Unbiased Scene Graph Generation in Videos accepted by CVPR…☆24Sep 9, 2025Updated 5 months ago
- [AAAI 2021] Confidence-aware Non-repetitive Multimodal Transformers for TextCaps☆24Mar 29, 2023Updated 2 years ago
- [Findings of EMNLP 2022] AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant☆23Sep 11, 2023Updated 2 years ago
- Official Repository for CVPR 2022 paper "REX: Reasoning-aware and Grounded Explanation"☆22Nov 21, 2023Updated 2 years ago
- [ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"☆69Oct 11, 2021Updated 4 years ago