VRU-NExT/VideoQA

☆104

Alternatives and similar repositories for VideoQA

Users who are interested in VideoQA compare it with the repositories listed below.

sail-sg / VGT
Video Graph Transformer for Video Question Answering (ECCV'22)
☆49 · Jun 8, 2023 · Updated 3 years ago
yl3800 / TranSTR
☆12 · Dec 15, 2023 · Updated 2 years ago
doc-doc / NExT-QA
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
☆189 · Aug 2, 2025 · Updated 11 months ago
Yui010206 / SeViLA
[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering
☆198 · Jan 14, 2024 · Updated 2 years ago
doc-doc / HQGA
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)
☆35 · Sep 17, 2022 · Updated 3 years ago
showlab / mist
☆37 · Dec 20, 2023 · Updated 2 years ago
yl3800 / EIGV
☆15 · Aug 12, 2022 · Updated 3 years ago
antoyang / FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
☆159 · Dec 9, 2024 · Updated last year
XLiu443 / Tem-adapter
[ICCV 2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer
☆37 · Oct 18, 2023 · Updated 2 years ago
xudejing / video-question-answering
Video Question Answering via Gradually Refined Attention over Appearance and Motion
☆178 · Dec 5, 2017 · Updated 8 years ago
engindeniz / vitis
[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
☆13 · Jan 13, 2025 · Updated last year
yl3800 / IGV
This repo contains code for Invariant Grounding for Video Question Answering
☆27 · Mar 2, 2023 · Updated 3 years ago
ByZ0e / Glance-Focus
This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)
☆31 · Jun 28, 2024 · Updated 2 years ago
antoyang / just-ask
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
☆127 · Sep 29, 2023 · Updated 2 years ago
doc-doc / CoVGT
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
☆20 · Mar 9, 2024 · Updated 2 years ago
thaolmk54 / hcrn-videoqa
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
☆135 · Jul 25, 2024 · Updated last year
MichiganNLP / In-the-wild-QA
In-the-wild Question Answering
☆15 · May 10, 2023 · Updated 3 years ago
Yui010206 / CREMA
[ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
☆56 · Jul 1, 2025 · Updated last year
showlab / GEB-Plus
[ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
☆17 · Aug 24, 2022 · Updated 3 years ago
CeeZh / LLoVi
Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering"
☆106 · Oct 27, 2024 · Updated last year
salesforce / BiST
Code for the paper BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues (EMNLP'20)
☆11 · Jun 16, 2025 · Updated last year
nguyentthong / video-language-understanding
[ACL’24 Findings] Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
☆49 · May 12, 2026 · Updated 2 months ago
YangLiu9208 / CMCIR
[IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
☆20 · Jul 6, 2023 · Updated 3 years ago
iwangjian / Color4Dial
Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue (ACL Findings 2023)
☆21 · Nov 10, 2025 · Updated 8 months ago
wenhaochai / MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
☆704 · Jan 29, 2025 · Updated last year
csbobby / STAR_Benchmark
☆36 · Apr 18, 2024 · Updated 2 years ago
wenhuchen / Meta-Module-Network
Code for WACV 2021 Paper "Meta Module Network for Compositional Visual Reasoning"
☆43 · May 13, 2021 · Updated 5 years ago
sangminwoo / Temporal-Span-Proposal-Network-VidVRD
[ESWA 2025] Official PyTorch implementation of "What and When to look?: Temporal Span Proposal Network for Video Relation Detection"
☆16 · Aug 9, 2021 · Updated 4 years ago
MILVLG / activitynet-qa
A VideoQA dataset based on the videos from ActivityNet
☆94 · Nov 22, 2020 · Updated 5 years ago
abwilf / Social-IQ-2.0-Challenge
The Social-IQ 2.0 Challenge Release for the Artificial Social Intelligence Workshop at ICCV '23
☆38 · Oct 13, 2023 · Updated 2 years ago
mlvlab / Flipped-VQA
Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)
☆77 · Mar 26, 2025 · Updated last year
mlvlab / vid-TLDR
Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".
☆55 · Oct 21, 2025 · Updated 9 months ago
Ziyang412 / UCoFiA
PyTorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)
☆66 · Jun 7, 2024 · Updated 2 years ago
CeeZh / SILVR
Official Implementation for "SiLVR: A Simple Language-based Video Reasoning Framework"
☆19 · Jan 18, 2026 · Updated 6 months ago
cg1177 / Recursive-Multimodal-Agent
☆19 · Jul 1, 2026 · Updated 3 weeks ago
scofield7419 / Video-of-Thought
Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"
☆182 · Feb 25, 2025 · Updated last year
rvandeghen / elen0016-computer-vision-tutorial
☆14 · Nov 6, 2024 · Updated last year
rvandeghen / ASTOD
☆12 · Oct 13, 2023 · Updated 2 years ago
zhang-yu-wei / InBedder
[ACL 2024] Source code for InBedder, an instruction-following text embedder
☆31 · Oct 11, 2024 · Updated last year