mbzuai-oryx/VideoMathQA

mbzuai-oryx / VideoMathQA

VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos

☆24

Alternatives and similar repositories for VideoMathQA

Users who are interested in VideoMathQA are comparing it to the repositories listed below.

mbzuai-oryx / ARB
ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark
☆17 · May 25, 2025 · Updated last year
HashmatShadab / HSAT
[MICCAI 2025] Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology
☆12 · Jun 17, 2025 · Updated last year
mbzuai-oryx / TimeTravel
[ACL 2025 🔥] Time Travel is a Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts
☆20 · May 22, 2025 · Updated last year
Amshaker / MAVOS
[WACV 2025] Efficient Video Object Segmentation via Modulated Cross-Attention Memory
☆61 · Feb 28, 2025 · Updated last year
mbzuai-oryx / VideoMolmo
Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing"
☆56 · Jul 5, 2025 · Updated last year
hananshafi / MTL-ViT
A new multi-task learning framework using Vision Transformers
☆11 · Jun 19, 2024 · Updated 2 years ago
mbzuai-oryx / DriveLMM-o1
Reasoning DriveLMM
☆15 · Mar 15, 2025 · Updated last year
mbzuai-oryx / ClimateGPT
[EMNLP'23] ClimateGPT: a specialized LLM for conversations related to Climate Change and Sustainability topics in both English and Arabi…
☆79 · Sep 24, 2024 · Updated last year
mbzuai-oryx / EvoLMM
Self Evolving Large Multimodal Models with Continuous Rewards
☆25 · Jun 9, 2026 · Updated last month
HashmatShadab / APR
(BMVC 2022--Oral) Official repository for "Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations" …
☆35 · Jan 8, 2023 · Updated 3 years ago
rohit901 / VANE-Bench
[NAACL'25] Contains code and documentation for our VANE-Bench paper.
☆24 · Aug 19, 2025 · Updated 11 months ago
ShahinaKK / LWI-VMS
Learnable Weight Initialization for Volumetric Medical Image Segmentation [Elsevier AIM2024]
☆22 · Oct 27, 2024 · Updated last year
Amshaker / GroupMamba
[CVPR -2025] GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model
☆142 · Mar 22, 2025 · Updated last year
HashmatShadab / Robustness-of-Volumetric-Medical-Segmentation-Models
[BMVC 2024] On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models
☆15 · Nov 1, 2024 · Updated last year
asif-hanif / vafa
[MICCAI 2023] Official code repository of paper titled "Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation"…
☆52 · Nov 14, 2023 · Updated 2 years ago
mbzuai-oryx / Video-CoM
Video-CoM: Interactive Video Reasoning via Chain of Manipulations
☆22 · Jun 17, 2026 · Updated last month
asif-hanif / baple
[MICCAI 2024] Official code repository of paper titled "BAPLe: Backdoor Attacks on Medical Foundation Models using Prompt Learning" accep…
☆56 · Oct 22, 2024 · Updated last year
ShahinaKK / LG_SDG
Language Grounded Single Source Domain Generalization in Medical Image Segmentation [ISBI2024]
☆33 · Oct 27, 2024 · Updated last year
mbzuai-oryx / VideoGLaMM
[CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
☆104 · Apr 14, 2025 · Updated last year
akhtarvision / cal-detr
☆42 · Nov 9, 2023 · Updated 2 years ago
akhtarvision / weather-regional
☆11 · Oct 29, 2024 · Updated last year
HashmatShadab / Robust-LLaVA
[ICCVW 2025 (Oral)] Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
☆29 · Oct 20, 2025 · Updated 9 months ago
hananshafi / MedContext
[MICCAI 2024] Official code for the paper "MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation"
☆14 · Nov 1, 2024 · Updated last year
HashmatShadab / MambaRobustness
[CVPRW 2025] Official repository of paper titled "Towards Evaluating the Robustness of Visual State Space Models"
☆26 · Jun 8, 2025 · Updated last year
mbzuai-oryx / ALM-Bench
[CVPR 2025 🔥] ALM-Bench is a multilingual multi-modal diverse cultural benchmark for 100 languages across 19 categories. It assesses the…
☆47 · May 26, 2025 · Updated last year
mbzuai-oryx / CVRR-Evaluation-Suite
[CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo…
☆50 · Aug 23, 2024 · Updated last year
Razaimam45 / TTL-Test-Time-Low-Rank-Adaptation
Official code repository of paper titled "Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Visio…
☆34 · May 11, 2025 · Updated last year
OmkarThawakar / composed-video-retrieval
Composed Video Retrieval
☆62 · May 2, 2024 · Updated 2 years ago
abdohelmy / D-3Former
Official repository of paper titled "D3Former: Debiased Dual Distilled Transformer for Incremental Learning".
☆25 · Jul 10, 2023 · Updated 3 years ago
hananshafi / llmblueprint
[ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"
☆85 · May 18, 2024 · Updated 2 years ago
mbzuai-oryx / XrayGPT
[BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.
☆529 · Aug 8, 2024 · Updated last year
BioMedIA-MBZUAI / MedPromptX
☆71 · Jul 2, 2025 · Updated last year
faresmalik / SEViT
Source code for MICCAI 2022 paper entitled: 'Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification'
☆36 · Jan 13, 2023 · Updated 3 years ago
mbzuai-oryx / LongShOT
A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos
☆21 · Jun 20, 2026 · Updated last month
fahadshamshad / deep-facial-privacy-prior
[ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors".
☆12 · Oct 11, 2024 · Updated last year
umair1221 / WorldCache
WorldCache: Content-Aware Caching for Accelerated Video World Models
☆21 · Jun 28, 2026 · Updated 3 weeks ago
Amshaker / Mobile-VideoGPT
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model
☆142 · Aug 6, 2025 · Updated 11 months ago
techmn / cosnet
A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025)
☆12 · Aug 11, 2025 · Updated 11 months ago
hanoonaR / object-centric-ovd
[NeurIPS 2022] Official repository of paper titled "Bridging the Gap between Object and Image-level Representations for Open-Vocabulary …
☆297 · Oct 12, 2022 · Updated 3 years ago