QwenLM / Qwen2.5-Math
A series of math-specific large language models of our Qwen2 series.
☆960 · Updated 5 months ago
Alternatives and similar repositories for Qwen2.5-Math
Users interested in Qwen2.5-Math are comparing it to the repositories listed below.
- Scalable RL solution for advanced reasoning of language models ☆1,642 · Updated 3 months ago
- An Open Large Reasoning Model for Real-World Solutions ☆1,502 · Updated last month
- ☆529 · Updated 10 months ago
- ☆1,356 · Updated 7 months ago
- Large Reasoning Models ☆805 · Updated 7 months ago
- ☆796 · Updated last month
- ☆808 · Updated last week
- ☆580 · Updated 2 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,012 · Updated last week
- Official Repo for Open-Reasoner-Zero ☆1,983 · Updated last month
- O1 Replication Journey ☆1,991 · Updated 5 months ago
- State-of-the-art bilingual open-sourced Math reasoning LLMs. ☆515 · Updated 8 months ago
- DataComp for Language Models ☆1,322 · Updated 3 months ago
- LIMO: Less is More for Reasoning ☆975 · Updated 3 months ago
- ☆725 · Updated last month
- ☆943 · Updated 5 months ago
- ReasonFlux Series - Open-source innovative LLM post-training algorithms focusing on data selection, reinforcement learning, and inference… ☆442 · Updated this week
- [NeurIPS D&B 2024] Generative AI for Math: MathPile ☆414 · Updated 3 months ago
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆419 · Updated last month
- Muon is Scalable for LLM Training ☆1,091 · Updated 3 months ago
- ☆1,083 · Updated last year
- Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models". ☆619 · Updated 2 weeks ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆643 · Updated 5 months ago
- A series of technical reports on Slow Thinking with LLM ☆706 · Updated last month
- Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling". ☆267 · Updated 4 months ago
- An Open-source RL System from ByteDance Seed and Tsinghua AIR ☆1,406 · Updated last month
- [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward ☆904 · Updated 4 months ago
- Pretraining code for a large-scale depth-recurrent language model ☆788 · Updated 3 weeks ago
- Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code" ☆582 · Updated 2 weeks ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior. ☆244 · Updated 2 months ago