Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks
☆37 · Nov 27, 2025 · Updated 3 months ago
Alternatives and similar repositories for Agent-X
Users who are interested in Agent-X compare it to the libraries listed below.
- A new multi-task learning framework using Vision Transformers ☆11 · Jun 19, 2024 · Updated last year
- a Video Quality Analysis Toolkit ☆13 · May 16, 2025 · Updated 9 months ago
- ☆11 · Oct 29, 2024 · Updated last year
- Neural ODE Transformers (ICLR 2025) ☆17 · Sep 6, 2025 · Updated 6 months ago
- VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos ☆22 · Jan 26, 2026 · Updated last month
- [NAACL'25] Contains code and documentation for our VANE-Bench paper. ☆23 · Aug 19, 2025 · Updated 6 months ago
- ☆72 · Jul 20, 2025 · Updated 7 months ago
- ☆14 · Nov 26, 2021 · Updated 4 years ago
- [CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo… ☆50 · Aug 23, 2024 · Updated last year
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆80 · Jun 17, 2024 · Updated last year
- [Findings of EMNLP'2024] Unified Active Retrieval for Retrieval Augmented Generation ☆23 · Sep 30, 2024 · Updated last year
- The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” [EMNLP25] ☆34 · Sep 1, 2025 · Updated 6 months ago
- [CVPRW 2025] Official repository of paper titled "Towards Evaluating the Robustness of Visual State Space Models" ☆26 · Jun 8, 2025 · Updated 8 months ago
- Benchmark and model for step-by-step reasoning in autonomous driving. ☆69 · Mar 15, 2025 · Updated 11 months ago
- MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots ☆78 · Dec 5, 2025 · Updated 3 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆109 · May 27, 2025 · Updated 9 months ago
- ☆51 · Jul 31, 2025 · Updated 7 months ago
- [CVPR 2026] DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning ☆83 · Feb 21, 2026 · Updated 2 weeks ago
- Evaluate the Quality of Critique ☆36 · Jun 1, 2024 · Updated last year
- [ACCV 2024] ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes 🚀🚀🚀 ☆37 · Jan 21, 2025 · Updated last year
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆183 · Jun 5, 2025 · Updated 9 months ago
- ☆57 · Feb 2, 2026 · Updated last month
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆88 · Feb 15, 2025 · Updated last year
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆106 · Sep 18, 2025 · Updated 5 months ago
- ☆40 · Jul 26, 2024 · Updated last year
- Hierarchical Vision Transformers for Disease Progression Detection in Chest X-Ray Images ☆11 · Jan 11, 2024 · Updated 2 years ago
- Official implementation of "Meta-Entity Driven Triplet Mining for Aligning Medical Vision-Language Models" ☆14 · Mar 19, 2025 · Updated 11 months ago
- ☆25 · Aug 19, 2025 · Updated 6 months ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues ☆13 · Feb 7, 2024 · Updated 2 years ago
- ☆15 · Feb 12, 2026 · Updated 3 weeks ago
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation" ☆43 · Jul 19, 2024 · Updated last year
- Official implementation of the paper "LTrack: Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Rep… ☆12 · Jul 26, 2023 · Updated 2 years ago
- Project focused on enhancing the quality of low-fidelity endoscopy images using Generative Adversarial Networks (GANs) implemented in PyT… ☆17 · Jun 5, 2025 · Updated 9 months ago
- ☆12 · Sep 23, 2022 · Updated 3 years ago
- [ICCV2025] Hierarchical Visual Prompt Learning for Continual Video Instance Segmentation ☆14 · Feb 18, 2026 · Updated 2 weeks ago
- Official implementation of "Attention-aware semantic communications for collaborative inference" (IEEE IoTJ 2024) ☆13 · Jan 22, 2026 · Updated last month
- [CVPR 2021] FMO Deblurring Benchmark ☆13 · Jan 12, 2022 · Updated 4 years ago
- ☆10 · Oct 5, 2022 · Updated 3 years ago
- Communication Relay by creating a WiFi Mesh Network using ROS, and using that network for Data Telemetry, with Telemetry radios ( Ubiquit… ☆11 · Dec 18, 2018 · Updated 7 years ago