WissingChen/CRA-GQA

The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding" (CVPR 2025 Highlight).

☆52

Alternatives and similar repositories for CRA-GQA

Users who are interested in CRA-GQA are comparing it to the repositories listed below.

YangLiu9208 / VisionGRU
VisionGRU: A Linear-Complexity RNN Model for Efficient Image Analysis
☆13 • Dec 26, 2024 • Updated last year
HCPLab-SYSU / DDP-WM
DDP-WM: Disentangled Dynamics Prediction for Efficient World Models (ICML-26)
☆19 • Mar 4, 2026 • Updated 4 months ago
HCPLab-SYSU / DART
DART: Differentiable Adaptive Region Tokenizer for Vision Foundation Models
☆22 • Oct 13, 2025 • Updated 9 months ago
LZ-CH / DSPNet
The official repository of [CVPR2025] DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering
☆28 • Apr 18, 2025 • Updated last year
tychen-SJTU / MECD-Benchmark
[NeurIPS'24 spotlight] MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning. [TPAMI'25] MECD+
☆50 • Feb 11, 2026 • Updated 5 months ago
YangLiu9208 / CMCIR
[IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
☆20 • Jul 6, 2023 • Updated 3 years ago
fansunqi / AKeyS
Agentic Keyframe Search for Video Question Answering
☆18 • Jun 30, 2026 • Updated 3 weeks ago
HCPLab-SYSU / CMCIR
[IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
☆78 • Jul 6, 2023 • Updated 3 years ago
aurooj / SHG-VQA
Learning Situation Hyper-Graphs for Video Question Answering
☆23 • Feb 16, 2024 • Updated 2 years ago
YangLiu9208 / TCGL
[IEEE T-IP 2022] TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning
☆24 • Dec 19, 2023 • Updated 2 years ago
ZijiaLewisLu / CVPR2025-DeCafNet
Official Repo for CVPR 2025 Paper -- DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos
☆17 • Mar 16, 2026 • Updated 4 months ago
HCPLab-SYSU / TAVP
Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation (CVPR-26)
☆25 • May 19, 2026 • Updated 2 months ago
wbfwonderful / Vad-R1
[NeurIPS 2025] Official repositories for "Vad-R1: Towards Video Anomaly Reasoning via Perception-to-Cognition Chain-of-Thought".
☆31 • Jan 30, 2026 • Updated 5 months ago
WissingChen / CMCRL
The official implementation of "Cross-Modal Causal Representation Learning for Radiology Report Generation" (IEEE T-IP 2025)
☆68 • May 27, 2025 • Updated last year
YangLiu9208 / CDFAG
Transferable Feature Representation for Visible-to-Infrared Cross-Dataset Human Action Recognition (Complexity 2018)
☆13 • Dec 14, 2022 • Updated 3 years ago
showlab / mist
☆37 • Dec 20, 2023 • Updated 2 years ago
minghangz / SPL
Generating Structured Pseudo Labels for Noise-resistant Zero-shot Video Sentence Localization
☆16 • Jul 20, 2023 • Updated 3 years ago
doc-doc / NExT-GQA
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
☆89 • Jul 1, 2024 • Updated 2 years ago
guikunchen / SDSGG
[NeurIPS'24] Scene Graph Generation with Role-Playing Large Language Models
☆15 • Oct 10, 2025 • Updated 9 months ago
ByZ0e / Glance-Focus
This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)
☆31 • Jun 28, 2024 • Updated 2 years ago
zhengrongz / AoTD
[CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".
☆58 • Updated this week
joslefaure / HERMES
[ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics
☆37 • Sep 10, 2025 • Updated 10 months ago
Ziyang412 / VideoTree
Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
☆166 • Jun 23, 2025 • Updated last year
Zhuo-Cao / FlashVTG
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. (WACV2025)
☆39 • Apr 17, 2025 • Updated last year
hshjerry / VideoEspresso
[CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
☆140 • Jul 28, 2025 • Updated last year
zhoujiahuan1991 / CVPR2025-STOP
☆19 • May 8, 2025 • Updated last year
yl3800 / IGV
This repo contains code for Invariant Grounding for Video Question Answering
☆27 • Mar 2, 2023 • Updated 3 years ago
mlvlab / DeepVideoR1
[NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1"
☆37 • Feb 22, 2026 • Updated 5 months ago
lwpyh / CoS_codes
CoS: Chain-of-Shot Prompting for Long Video Understanding
☆53 • Feb 13, 2025 • Updated last year
YangLiu9208 / SAKDN
[IEEE T-IP 2021] Semantics-aware Adaptive Knowledge Distillation for Cross-modal Action Recognition
☆29 • Jan 6, 2025 • Updated last year
HuiGuanLab / RaTSG
This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"
☆13 • Aug 22, 2025 • Updated 11 months ago
gyxxyg / VTG-LLM
[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
☆130 • Dec 10, 2024 • Updated last year
qirui-chen / RGA3-release
[ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring
☆24 • Aug 8, 2025 • Updated 11 months ago
yl3800 / EIGV
☆15 • Aug 12, 2022 • Updated 3 years ago
HengLan / TA-STVG
[ICLR 2025] Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
☆44 • Mar 18, 2025 • Updated last year
chakravarthi589 / Video-Question-Answering_Resources
Video Question Answering | Video QA | VQA
☆97 • Jun 12, 2026 • Updated last month
qirui-chen / MultiHop-EgoQA
[AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
☆38 • May 27, 2025 • Updated last year
WHB139426 / Grounded-Video-LLM
[EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
☆149 • Aug 21, 2025 • Updated 11 months ago
Ziyang412 / UCoFiA
Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)
☆66 • Jun 7, 2024 • Updated 2 years ago