huggingface / Math-Verify

☆971

Alternatives and similar repositories for Math-Verify

Users who are interested in Math-Verify are comparing it to the libraries listed below.

RUCAIBox / Slow_Thinking_with_LLMs
A series of technical reports on Slow Thinking with LLMs
☆739 Updated 2 months ago
THUDM / ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
☆671 Updated 9 months ago
princeton-nlp / SimPO
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
☆923 Updated 8 months ago
magpie-align / magpie
[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data …
☆782 Updated 7 months ago
SimpleBerry / LLaMA-O1
Large Reasoning Models
☆805 Updated 10 months ago
zhentingqi / rStar
☆963 Updated 8 months ago
allenai / reward-bench
RewardBench: the first evaluation tool for reward models.
☆642 Updated 4 months ago
GAIR-NLP / LIMO
[COLM 2025] LIMO: Less is More for Reasoning
☆1,037 Updated 2 months ago
ZubinGou / math-evaluation-harness
A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨
☆258 Updated last year
ContextualAI / HALOs
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
☆889 Updated 3 weeks ago
lqtrung1998 / mwp_ReFT
☆548 Updated 9 months ago
TIGER-AI-Lab / verl-tool
A version of verl to support diverse tool use
☆607 Updated this week
MARIO-Math-Reasoning / Super_MARIO
☆342 Updated 4 months ago
sail-sg / understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
☆1,126 Updated last month
sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆539 Updated this week
Open-Reasoner-Zero / Open-Reasoner-Zero
Official Repo for Open-Reasoner-Zero
☆2,054 Updated 4 months ago
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆1,060 Updated this week
princeton-nlp / LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
☆496 Updated last year
eddycmu / demystify-long-cot
☆323 Updated 4 months ago
BytedTsinghua-SIA / DAPO
An Open-source RL System from ByteDance Seed and Tsinghua AIR
☆1,597 Updated 5 months ago
NVIDIA-NeMo / Skills
A project to improve skills of large language models
☆587 Updated this week
Eclipsess / Awesome-Efficient-Reasoning-LLMs
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
☆651 Updated this week
NVIDIA / NeMo-Aligner
Scalable toolkit for efficient model alignment
☆842 Updated 2 weeks ago
RLHFlow / RLHF-Reward-Modeling
Recipes to train reward models for RLHF.
☆1,463 Updated 5 months ago
PRIME-RL / PRIME
Scalable RL solution for advanced reasoning of language models
☆1,751 Updated 7 months ago
dvlab-research / Step-DPO
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
☆384 Updated 9 months ago
THUDM / LongBench
LongBench v2 and LongBench (ACL 25'&24')
☆992 Updated 9 months ago
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆2,170 Updated this week
jzhang38 / EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
☆747 Updated last year
Qihoo360 / Light-R1
☆748 Updated last month