wangqinsi1 / Vision-Zero
This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.
☆101 Updated last month
Alternatives and similar repositories for Vision-Zero
Users who are interested in Vision-Zero are comparing it to the libraries listed below.
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration" ☆125 Updated 3 months ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆137 Updated 4 months ago
- Geometric-Mean Policy Optimization ☆95 Updated 3 weeks ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆122 Updated 4 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆70 Updated 6 months ago
- ☆226 Updated 9 months ago
- [EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time ☆87 Updated 5 months ago
- [NeurIPS 2025] Thinkless: LLM Learns When to Think ☆242 Updated 2 months ago
- ☆116 Updated this week
- ☆335 Updated last month
- ☆185 Updated 2 weeks ago
- ☆41 Updated 6 months ago
- Demystifying Reinforcement Learning in Agentic Reasoning ☆126 Updated last month
- The code and data of We-Math 2.0. ☆162 Updated 3 months ago
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI. ☆81 Updated last month
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆49 Updated 6 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning" ☆237 Updated 3 weeks ago
- ☆85 Updated 8 months ago
- ☆56 Updated last year
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆352 Updated 5 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆73 Updated last year
- Implementation for OAgents: An Empirical Study of Building Effective Agents ☆289 Updated last month
- ☆68 Updated 2 months ago
- ☆142 Updated 7 months ago
- Visual Planning: Let's Think Only with Images ☆283 Updated 6 months ago
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr… ☆289 Updated 3 weeks ago
- Reinforcement Learning of Vision Language Models with Self Visual Perception Reward ☆146 Updated 2 months ago
- RLP: Reinforcement as a Pretraining Objective ☆205 Updated 2 months ago
- ☆19 Updated 9 months ago
- (ACL-2025 main conference) Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback ☆36 Updated 5 months ago