MILVLG/activitynet-qa

A VideoQA dataset based on the videos from ActivityNet

☆94

Alternatives and similar repositories for activitynet-qa

Users that are interested in activitynet-qa are comparing it to the libraries listed below.

xudejing / video-question-answering
Video Question Answering via Gradually Refined Attention over Appearance and Motion
☆178 · Dec 5, 2017 · Updated 8 years ago
YunseokJANG / tgif-qa
Repository for our CVPR 2017 and IJCV: TGIF-QA
☆180 · Sep 6, 2021 · Updated 4 years ago
doc-doc / NExT-QA
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
☆189 · Aug 2, 2025 · Updated 11 months ago
doc-doc / NExT-OE
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
☆30 · Jul 18, 2023 · Updated 3 years ago
jayleicn / TVQA
[EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering
☆181 · Oct 25, 2022 · Updated 3 years ago
thaolmk54 / hcrn-videoqa
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
☆135 · Jul 25, 2024 · Updated last year
fanchenyou / HME-VideoQA
Heterogeneous Memory Enhanced Multimodal Attention Model for VideoQA
☆55 · Sep 13, 2021 · Updated 4 years ago
antoyang / just-ask
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
☆127 · Sep 29, 2023 · Updated 2 years ago
jayleicn / TVQAplus
[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
☆132 · Oct 25, 2022 · Updated 3 years ago
yj-yu / lsmdc
☆33 · Nov 12, 2018 · Updated 7 years ago
egoschema / EgoSchema
☆117 · Dec 30, 2024 · Updated last year
facebookresearch / corefnmn
Visual Coreference Resolution in Visual Dialog using Neural Module Networks
☆58 · Oct 12, 2021 · Updated 4 years ago
lbaermann / qaego4d
Code and Dataset for the CVPRW Paper "Where did I leave my keys? — Episodic-Memory-Based Question Answering on Egocentric Videos"
☆31 · Aug 28, 2023 · Updated 2 years ago
MILVLG / mt-captioning
A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning
☆25 · Sep 4, 2020 · Updated 5 years ago
VisionLearningGroup / Text-to-Clip_Retrieval
Implementation for "Multilevel Language and Vision Integration for Text-to-Clip Retrieval"
☆49 · Jan 21, 2019 · Updated 7 years ago
llyx97 / TempCompass
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, …
☆133 · Apr 4, 2025 · Updated last year
gramuah / pose-errors
Pose Estimation Errors, the Ultimate Diagnosis
☆10 · Apr 22, 2021 · Updated 5 years ago
lscpku / VITATECS
☆18 · Jul 10, 2024 · Updated 2 years ago
ozgyal / ActivityNet-Video-Downloader
A simple script for downloading videos of the ActivityNet dataset by parsing URLs from a given .json file.
☆21 · Nov 30, 2017 · Updated 8 years ago
bupt-cist / vqa-playground-pytorch
Code for NIPS 2018 paper, "Chain of Reasoning for Visual Question Answering"
☆28 · Nov 23, 2018 · Updated 7 years ago
waybarrios / guidance-based-video-grounding
[ICCV 2023] The official PyTorch implementation of the paper: "Localizing Moments in Long Video Via Multimodal Guidance"
☆23 · Sep 26, 2024 · Updated last year
agakshat / visualdialog-pytorch
Community Regularization of Visually Grounded Dialog https://arxiv.org/abs/1808.04359
☆15 · May 16, 2019 · Updated 7 years ago
csbobby / STAR_Benchmark
☆36 · Apr 18, 2024 · Updated 2 years ago
jimmy646 / violin
Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"
☆161 · Apr 29, 2020 · Updated 6 years ago
abwilf / Social-IQ-2.0-Challenge
The Social-IQ 2.0 Challenge Release for the Artificial Social Intelligence Workshop at ICCV '23
☆38 · Oct 13, 2023 · Updated 2 years ago
LisaAnne / TemporalLanguageRelease
☆44 · Mar 8, 2021 · Updated 5 years ago
JUNJIE99 / MLVU
🔥🔥 MLVU: Multi-task Long Video Understanding Benchmark
☆263 · Apr 13, 2026 · Updated 3 months ago
gicheonkang / dan-visdial
✨ Official PyTorch Implementation for EMNLP'19 Paper, "Dual Attention Networks for Visual Reference Resolution in Visual Dialog"
☆44 · Mar 19, 2023 · Updated 3 years ago
satwikkottur / clevr-dialog
Repository to generate CLEVR-Dialog: A diagnostic dataset for Visual Dialog
☆50 · Feb 18, 2020 · Updated 6 years ago
facebookresearch / grounded-video-description
Video Grounding and Captioning
☆331 · Oct 12, 2021 · Updated 4 years ago
jayleicn / ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…
☆730 · Aug 8, 2023 · Updated 2 years ago
jokieleung / awesome-visual-question-answering
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Common…
☆672 · Jul 6, 2023 · Updated 3 years ago
zjucsq / PLA
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12 · Sep 17, 2023 · Updated 2 years ago
CogComp / Salient-Event-Detection
The repository for the paper "Is Killed More Significant than Fled? A Contextual Model for Salient Event Detection"
☆10 · Jul 5, 2022 · Updated 4 years ago
yaohungt / Gated-Spatio-Temporal-Energy-Graph
[CVPR'19] [PyTorch] Gated Spatio Temporal Energy Graph
☆153 · Feb 20, 2020 · Updated 6 years ago
Yui010206 / SeViLA
[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering
☆198 · Jan 14, 2024 · Updated 2 years ago
google-deepmind / perception_test
☆253 · Jun 19, 2026 · Updated last month
shengyuzhang / DeVLBert
DeVLBert: Learning Deconfounded Visio-Linguistic Representations
☆27 · Nov 27, 2022 · Updated 3 years ago
maurya-rohit / Scene-Graph-For-Videos
☆15 · Aug 20, 2024 · Updated last year