aurooj/SHG-VQA

Learning Situation Hyper-Graphs for Video Question Answering

☆23

Alternatives and similar repositories for SHG-VQA

Users who are interested in SHG-VQA are comparing it to the libraries listed below.

aioz-ai / CFR_VQA
Coarse-to-Fine Reasoning for Visual Question Answering (CVPRW'22)
☆48 · Apr 22, 2026 · Updated 3 months ago
ByZ0e / Glance-Focus
This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)
☆31 · Jun 28, 2024 · Updated 2 years ago
jingchenchen / ReasoningConsistency-VQA
☆13 · Aug 14, 2022 · Updated 3 years ago
yl3800 / TranSTR
☆12 · Dec 15, 2023 · Updated 2 years ago
traveler-framework / TraveLER
[EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering
☆18 · Oct 31, 2024 · Updated last year
zhang-can / UP-TAL
[CVPR2022] Unsupervised Pre-training for Temporal Action Localization Tasks (UP-TAL)
☆29 · Mar 9, 2022 · Updated 4 years ago
jialinwu17 / MAVEX
☆30 · Dec 16, 2022 · Updated 3 years ago
JingweiJ / ActionGenome
A video database bridging human actions and human-object relationships
☆165 · Jun 30, 2020 · Updated 6 years ago
WissingChen / CRA-GQA
The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding. (CVPR 2025 Highlight)"
☆52 · Apr 27, 2025 · Updated last year
sail-sg / VGT
Video Graph Transformer for Video Question Answering (ECCV'22)
☆49 · Jun 8, 2023 · Updated 3 years ago
ShiYaya / emscore
Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"
☆26 · Oct 20, 2022 · Updated 3 years ago
Hokhim2 / CVBench
☆19 · Aug 28, 2025 · Updated 11 months ago
cvlab-columbia / DoubleRight
☆27 · Jan 25, 2024 · Updated 2 years ago
mlvlab / Flipped-VQA
Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)
☆77 · Mar 26, 2025 · Updated last year
XLiu443 / Tem-adapter
[ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer
☆37 · Oct 18, 2023 · Updated 2 years ago
showlab / mist
☆37 · Dec 20, 2023 · Updated 2 years ago
jiangchaokang / NeuroGauss4D-PCI
NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation
☆20 · May 27, 2024 · Updated 2 years ago
doc-doc / CoVGT
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
☆20 · Mar 9, 2024 · Updated 2 years ago
zhangxi1997 / VQACL
VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23)
☆45 · Mar 28, 2024 · Updated 2 years ago
csbobby / STAR_Benchmark
☆36 · Apr 18, 2024 · Updated 2 years ago
aalto-intelligent-robotics / llm-trajectory-prediction
Exploring Large Language Models for Trajectory Prediction: A Technical Perspective
☆29 · Jun 12, 2024 · Updated 2 years ago
sayaknag / unbiasedSGG
Official Pytorch Implementation of the framework TEMPURA proposed in our paper Unbiased Scene Graph Generation in Videos accepted by CVPR…
☆25 · Sep 9, 2025 · Updated 10 months ago
JasonCodeMaker / CTVR
☆16 · Jun 2, 2025 · Updated last year
layer6ai-labs / ASL
Code for CVPR'21 paper "Weakly Supervised Action Selection Learning in Video"
☆24 · Apr 1, 2021 · Updated 5 years ago
YuJungHeo / kbvqa-public
☆40 · Nov 29, 2022 · Updated 3 years ago
kj3moraes / movieclip
An experiment with movie scenes and contrastive learning
☆11 · Feb 1, 2025 · Updated last year
ngl567 / DHNS
DASFAA 2025: Diffusion-based Hierarchical Negative Sampling for Multimodal Knowledge Graph Completion
☆19 · Feb 17, 2025 · Updated last year
CeeZh / SILVR
Official Implementation for "SiLVR: A Simple Language-based Video Reasoning Framework"
☆19 · Jan 18, 2026 · Updated 6 months ago
lwpyh / CoS_codes
CoS: Chain-of-Shot Prompting for Long Video Understanding
☆53 · Feb 13, 2025 · Updated last year
mlvlab / DeepVideoR1
[NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1"
☆37 · Feb 22, 2026 · Updated 5 months ago
zhengrongz / AoTD
[CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".
☆58 · Updated this week
snumprlab / isr-dpo
Official Implementation of ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25)
☆23 · Nov 25, 2025 · Updated 8 months ago
NJU-LINK / IF-VidCap
The Source Code for IF-VidCap @ICLR 2026
☆19 · Oct 22, 2025 · Updated 9 months ago
HAWLYQ / ET-Cap
☆24 · Oct 8, 2023 · Updated 2 years ago
rentainhe / TRAR-VQA
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
☆68 · Oct 11, 2021 · Updated 4 years ago
AslanDing / Robust-Fidelity
A robust metric (robust fidelity) for XGNN (ICLR24)
☆12 · Jun 3, 2025 · Updated last year
jayleicn / TVRetrieval
[ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
☆163 · May 28, 2024 · Updated 2 years ago
daeunni / Video-Skill-CoT
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆18 · Aug 27, 2025 · Updated 11 months ago
DCDmllm / Momentor
☆81 · Nov 24, 2024 · Updated last year