Video Question Answering | Video QA | VQA
☆91 · Nov 17, 2025 · Updated 4 months ago
Alternatives and similar repositories for Video-Question-Answering_Resources
Users that are interested in Video-Question-Answering_Resources are comparing it to the libraries listed below.
- ☆13 · Feb 26, 2024 · Updated 2 years ago
- Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24] ☆10 · Jul 22, 2024 · Updated last year
- ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model ☆16 · Jan 31, 2024 · Updated 2 years ago
- This repository contains code for AAAI2025 paper "Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal … ☆23 · Aug 18, 2025 · Updated 7 months ago
- LMM for VQA, tcsvt version ☆10 · Jul 19, 2024 · Updated last year
- ☆17 · Aug 11, 2023 · Updated 2 years ago
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆29 · Sep 27, 2024 · Updated last year
- Courbariaux, Matthieu, Yoshua Bengio, and Jean-Pierre David. "Binaryconnect: Training deep neural networks with binary weights during pro… ☆12 · Aug 31, 2020 · Updated 5 years ago
- [EMNLP 2024] A Video Chat Agent with Temporal Prior ☆32 · Mar 2, 2025 · Updated last year
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] 🎞️ LVNet. ☆42 · Feb 10, 2026 · Updated last month
- Synthetic Event-based Vision Dataset for Ego and Fixed Traffic Perception ☆19 · May 7, 2024 · Updated last year
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation". ☆55 · May 25, 2025 · Updated 9 months ago
- [ACM MM2024] The code for HMLLM. ☆11 · Oct 27, 2024 · Updated last year
- Repository of PIXAR, a Pixel-based Auto-Regressive Language Model ☆18 · Sep 15, 2025 · Updated 6 months ago
- ☆12 · Dec 15, 2023 · Updated 2 years ago
- [3DV 2025] VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition ☆18 · Mar 18, 2025 · Updated last year
- Software Engineering Back End Microservices Project ☆15 · Nov 20, 2024 · Updated last year
- [IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering ☆20 · Jul 6, 2023 · Updated 2 years ago
- TTRV: Test-Time Reinforcement Learning for Vision–Language Models (CVPR 2026) ☆37 · Mar 8, 2026 · Updated 2 weeks ago
- AgenTracer: A Lightweight Failure Attributor for Agentic Systems ☆81 · Nov 12, 2025 · Updated 4 months ago
- Visualization of China's GDP and population data over the years ☆13 · Jan 18, 2023 · Updated 3 years ago
- Implementation of the DocLLM paper for Llama models. ☆13 · Apr 6, 2025 · Updated 11 months ago
- Implementation of the paper: "BRAVE : Broadening the visual encoding of vision-language models" ☆26 · Updated this week
- [CVPR 2026] UFVideo: Towards Unified Fine-Grained Video Cooperative Understanding with Large Language Models ☆37 · Feb 21, 2026 · Updated last month
- ☆27 · Feb 12, 2026 · Updated last month
- "FusionFactory: Fusing LLM Capabilities with Routing Data", Tao Feng, Haozhen Zhang, Zijie Lei, Pengrui Han, Mostofa Patwary, Mohammad Sh… ☆19 · Dec 30, 2025 · Updated 2 months ago
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models ☆24 · Jan 1, 2026 · Updated 2 months ago
- [CVPR 2026] Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding ☆73 · Updated this week
- [ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H… ☆21 · Jul 26, 2025 · Updated 7 months ago
- Github Repo for ICML 2022 paper: Communication-Efficient Adaptive Federated Learning ☆10 · Nov 18, 2022 · Updated 3 years ago
- ☆15 · Aug 12, 2022 · Updated 3 years ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight) ☆84 · Jul 1, 2024 · Updated last year
- ☆14 · Oct 6, 2024 · Updated last year
- [NeurIPS'24 spotlight] MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning. [TPAMI'25] MECD+ ☆47 · Feb 11, 2026 · Updated last month
- ☆12 · Nov 3, 2023 · Updated 2 years ago
- Code for AISTATS'25 paper - On the Power of Adaptive Weighted Aggregation in Heterogeneous Federated Learning and Beyond ☆13 · Sep 23, 2025 · Updated 6 months ago
- A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning ☆36 · Mar 12, 2026 · Updated last week
- ☆17 · May 18, 2024 · Updated last year
- ☆33 · Aug 17, 2025 · Updated 7 months ago