We-Math / We-Math2.0

The code and data of We-Math 2.0.

☆164

Alternatives and similar repositories for We-Math2.0

Users who are interested in We-Math2.0 are comparing it to the repositories listed below.

agents-x-project / PyVision
[MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."
☆147 Updated 6 months ago
callsys / GMPO
[ICLR 2026] Geometric-Mean Policy Optimization
☆99 Updated last week
yihedeng9 / OpenVLThinker
OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement
☆129 Updated 6 months ago
CSfufu / Revisual-R1
🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal rei…
☆194 Updated last month
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆91 Updated 5 months ago
EvolvingLMMs-Lab / multimodal-search-r1
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…
☆387 Updated 5 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆180 Updated 8 months ago
EvolvingLMMs-Lab / OpenMMReasoner
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
☆141 Updated last month
VainF / Thinkless
[NeurIPS 2025] Thinkless: LLM Learns When to Think
☆250 Updated 4 months ago
TEAM-ARM / arm
[NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model
☆64 Updated 3 months ago
RUCAIBox / Virgo
Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*
☆109 Updated 8 months ago
idanshen / Self-Distillation
☆70 Updated last week
yannqi / R-4B
The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration"
☆136 Updated 5 months ago
si0wang / ThinkLite-VL
☆107 Updated 7 months ago
We-Math / V-Thinker
☆169 Updated 2 months ago
kokolerk / TON
[NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
☆53 Updated 4 months ago
TsinghuaC3I / Unify-Post-Training
Towards a Unified View of Large Language Model Post-Training
☆200 Updated 4 months ago
yxf203 / Awesome-Efficient-Agents
Survey and paper list on efficiency-guided LLM agents (memory, tool learning, planning).
☆154 Updated last week
We-Math / We-Math
The code and data of We-Math, accepted by ACL 2025 main conference.
☆134 Updated last month
MiniMax-AI / One-RL-to-See-Them-All
The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning
☆330 Updated 8 months ago
lzhxmu / CPPO
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025)
☆172 Updated 3 months ago
thunlp / JustRL
☆230 Updated last month
inclusionAI / dFactory
Easy and Efficient dLLM Fine-Tuning
☆208 Updated 2 weeks ago
DAMO-NLP-SG / multimodal_textbook
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
☆191 Updated 10 months ago
xufangzhi / Genius
[ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework
☆71 Updated 8 months ago
microsoft / x-reasoner
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains
☆50 Updated 8 months ago
zli12321 / Vision-SR1
Reinforcement Learning of Vision Language Models with Self Visual Perception Reward
☆159 Updated 4 months ago
beichenzbc / BoostStep
Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"
☆37 Updated last year
NVlabs / GDPO
Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
☆349 Updated 3 weeks ago
Tencent / llm.hunyuan.T1
☆82 Updated 10 months ago