VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos
★24 · May 7, 2026 · Updated last month
Alternatives and similar repositories for VideoMathQA
Users that are interested in VideoMathQA are comparing it to the libraries listed below.
- ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark ★17 · May 25, 2025 · Updated last year
- [ACL 2025 🔥] Time Travel is a Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts ★19 · May 22, 2025 · Updated last year
- A new multi-task learning framework using Vision Transformers ★11 · Jun 19, 2024 · Updated 2 years ago
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [Elsevier AIM2024] ★22 · Oct 27, 2024 · Updated last year
- Self Evolving Large Multimodal Models with Continuous Rewards ★24 · Jun 9, 2026 · Updated 3 weeks ago
- [BMVC 2024] On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models ★15 · Nov 1, 2024 · Updated last year
- Language Grounded Single Source Domain Generalization in Medical Image Segmentation [ISBI2024] ★33 · Oct 27, 2024 · Updated last year
- [NAACL'25] Contains code and documentation for our VANE-Bench paper. ★24 · Aug 19, 2025 · Updated 10 months ago
- AIN - The First Arabic Inclusive Large Multimodal Model. It is a versatile bilingual LMM excelling in visual and contextual understanding… ★54 · Mar 13, 2025 · Updated last year
- [ICCVW 2025 (Oral)] Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models ★29 · Oct 20, 2025 · Updated 8 months ago
- [MICCAI 2024] Official code for the paper "MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation" ★14 · Nov 1, 2024 · Updated last year
- [CVPRW 2025] Official repository of paper titled "Towards Evaluating the Robustness of Visual State Space Models" ★26 · Jun 8, 2025 · Updated last year
- [CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo… ★50 · Aug 23, 2024 · Updated last year
- (BMVC 2022--Oral) Official repository for "Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations" … ★35 · Jan 8, 2023 · Updated 3 years ago
- [MICCAI 2023] Official code repository of paper titled "Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation"… ★52 · Nov 14, 2023 · Updated 2 years ago
- [WACV 2025] Efficient Video Object Segmentation via Modulated Cross-Attention Memory ★61 · Feb 28, 2025 · Updated last year
- [EMNLP'23] ClimateGPT: a specialized LLM for conversations related to Climate Change and Sustainability topics in both English and Arabi… ★79 · Sep 24, 2024 · Updated last year
- A code ★29 · Jan 23, 2025 · Updated last year
- [CVPR 2025] GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model ★142 · Mar 22, 2025 · Updated last year
- [MICCAI 2024] Official code repository of paper titled "BAPLe: Backdoor Attacks on Medical Foundation Models using Prompt Learning" accep… ★56 · Oct 22, 2024 · Updated last year
- [ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors". ★12 · Oct 11, 2024 · Updated last year
- A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025) ★12 · Aug 11, 2025 · Updated 10 months ago
- [CVPR 2025 🔥] ALM-Bench is a multilingual multi-modal diverse cultural benchmark for 100 languages across 19 categories. It assesses the… ★46 · May 26, 2025 · Updated last year
- [CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos ★104 · Apr 14, 2025 · Updated last year
- ICLR 2026: Agent-X Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks ★43 · Apr 28, 2026 · Updated 2 months ago
- [NAACL 2025 🔥] CAMEL-Bench is an Arabic benchmark for evaluating multimodal models across eight domains with 29,000 questions. ★38 · Apr 17, 2025 · Updated last year
- [ACCV 2024] ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes ★37 · Jan 21, 2025 · Updated last year
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges ★30 · Sep 24, 2023 · Updated 2 years ago
- ★42 · Nov 9, 2023 · Updated 2 years ago
- Official repository for "Boosting Adversarial Transferability using Dynamic Cues" (ICLR 2023) ★20 · Aug 24, 2023 · Updated 2 years ago
- [MICCAI 2023][Early Accept] Official code repository of paper titled "Cross-modulated Few-shot Image Generation for Colorectal Tissue Cla… ★47 · Sep 28, 2023 · Updated 2 years ago
- [BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models. ★532 · Aug 8, 2024 · Updated last year
- [NeurIPS2023] 3D-OWIS is capable of detecting unknown instances in inference, and progressively learning novel classes in the process of … ★68 · Dec 3, 2023 · Updated 2 years ago
- Composed Video Retrieval ★62 · May 2, 2024 · Updated 2 years ago
- [IEEE TMI 2025] MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention ★19 · Dec 15, 2025 · Updated 6 months ago
- [MICCAI 2024 🔥] HLSS, the first study to explore hierarchical information inherent in histopathology images and their language descripti… ★27 · Aug 5, 2024 · Updated last year
- [CVPR 2026 🔥] ThinkGeo is a Comprehensive Benchmark to evaluate Tool-Augmented Agents for Remote Sensing Tasks ★72 · May 29, 2026 · Updated last month
- Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing" ★56 · Jul 5, 2025 · Updated 11 months ago
- [CADL'22, ECCVW] Official repository of paper titled "EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Ap… ★416 · Jul 25, 2023 · Updated 2 years ago