bruno686 / VisPlay
☆37Updated last month
Alternatives and similar repositories for VisPlay
Users that are interested in VisPlay are comparing it to the libraries listed below
- ☆57Updated last month
- Holistic Evaluation of Multimodal LLMs on Spatial Intelligence☆53Updated last week
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT☆111Updated 2 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)☆218Updated 5 months ago
- ☆112Updated 5 months ago
- [NeurIPS 2025] The official repository for our paper, "Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reason…☆152Updated 3 months ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give…☆199Updated 2 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement☆125Updated 5 months ago
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO☆76Updated last month
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models …☆82Updated 7 months ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning☆133Updated 4 months ago
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆51Updated 5 months ago
- ☆62Updated 3 months ago
- ☆48Updated this week
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559)☆19Updated 6 months ago
- ☆161Updated last month
- Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"☆127Updated 2 weeks ago
- This is the official repository of InfiniteVL☆65Updated 2 weeks ago
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models☆33Updated last year
- Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing☆85Updated 5 months ago
- ☆63Updated 5 months ago
- ☆65Updated last month
- A collection of awesome think with videos papers.☆74Updated last month
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision☆190Updated last week
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward☆88Updated 4 months ago
- ☆42Updated 6 months ago
- [NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning☆36Updated 2 months ago
- Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks☆33Updated last month
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]☆172Updated 6 months ago
- ☆96Updated 6 months ago