imagination-research / aimo2
AIMO2 2nd place solution
☆57 Updated last week
Alternatives and similar repositories for aimo2
Users that are interested in aimo2 are comparing it to the libraries listed below
- ☆63 Updated 6 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆106 Updated 5 months ago
- ☆75 Updated 10 months ago
- A version of verl to support tool use ☆172 Updated this week
- ☆202 Updated 3 months ago
- ☆295 Updated last week
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆186 Updated 3 months ago
- ☆210 Updated 2 weeks ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆215 Updated 3 weeks ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆147 Updated 2 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆141 Updated 5 months ago
- Repository of LV-Eval Benchmark ☆65 Updated 9 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆106 Updated 2 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆220 Updated last year
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" ☆155 Updated this week
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆180 Updated 2 months ago
- Reproducing R1 for Code with Reliable Rewards ☆208 Updated last month
- Research code for the preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆95 Updated 2 months ago
- ☆95 Updated 2 weeks ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆155 Updated 2 weeks ago
- RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and DeepSeek R1-Zero reproduction. ☆62 Updated 3 months ago
- ☆104 Updated last month
- Async pipelined version of Verl ☆91 Updated last month
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository ☆42 Updated this week
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆104 Updated 4 months ago
- Repo of paper "Free Process Rewards without Process Labels" ☆149 Updated 2 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆129 Updated 10 months ago
- ☆210 Updated last week
- Simple extension on vLLM to help you speed up reasoning model without training. ☆158 Updated this week
- ☆69 Updated 6 months ago