bruno686 / VisPlay
☆28 · Updated 3 weeks ago
Alternatives and similar repositories for VisPlay
Users who are interested in VisPlay are comparing it to the libraries listed below.
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO ☆75 · Updated 3 weeks ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT ☆106 · Updated last month
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934 ☆151 · Updated last month
- ☆42 · Updated 6 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆205 · Updated 4 months ago
- ☆108 · Updated 4 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆123 · Updated 4 months ago
- ☆64 · Updated 5 months ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆191 · Updated 2 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆79 · Updated 6 months ago
- Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks ☆33 · Updated 2 weeks ago
- ☆55 · Updated last month
- Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence" ☆120 · Updated last month
- Holistic Evaluation of Multimodal LLMs on Spatial Intelligence ☆46 · Updated this week
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning ☆52 · Updated 4 months ago
- Official implementation of "Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology" ☆71 · Updated last month
- TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models ☆61 · Updated 2 weeks ago
- ☆65 · Updated last month
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? ☆80 · Updated 5 months ago
- [NeurIPS 2025] The official repository for our paper, "Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reason… ☆150 · Updated 3 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆87 · Updated 4 months ago
- Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing ☆82 · Updated 4 months ago
- ☆23 · Updated 7 months ago
- [NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning ☆33 · Updated last month
- DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning ☆163 · Updated 3 weeks ago
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision ☆177 · Updated 2 weeks ago
- [arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps ☆70 · Updated last month
- Visual Planning: Let's Think Only with Images ☆284 · Updated 6 months ago
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"? ☆36 · Updated 5 months ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning ☆131 · Updated 3 months ago