project-numina / aimo-progress-prize
☆466 Updated last year
Alternatives and similar repositories for aimo-progress-prize
Users who are interested in aimo-progress-prize are comparing it to the libraries listed below.
- Technical report of Kimina-Prover Preview. ☆338 Updated 3 months ago
- A project to improve skills of large language models ☆594 Updated this week
- ☆981 Updated 4 months ago
- ☆1,035 Updated 10 months ago
- Automatic evals for LLMs ☆550 Updated 4 months ago
- Reproducible, flexible LLM evaluations ☆260 Updated last week
- GPQA: A Graduate-Level Google-Proof Q&A Benchmark ☆420 Updated last year
- ☆540 Updated last year
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆546 Updated last week
- ☆209 Updated 6 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆333 Updated 11 months ago
- ☆546 Updated 11 months ago
- Evaluation of LLMs on latest math competitions ☆175 Updated 2 weeks ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,202 Updated 3 weeks ago
- ☆342 Updated 4 months ago
- Recipes to scale inference-time compute of open models ☆1,114 Updated 5 months ago
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆267 Updated 2 weeks ago
- RewardBench: the first evaluation tool for reward models. ☆646 Updated 4 months ago
- ☆165 Updated last year
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆259 Updated last year
- Retrieval-Augmented Theorem Provers for Lean ☆299 Updated 9 months ago
- [MathCoder, MathCoder-VL] Family of LLMs/LMMs for mathematical reasoning. ☆327 Updated 2 weeks ago
- A bibliography and survey of the papers surrounding o1 ☆1,208 Updated 11 months ago
- Large Reasoning Models ☆806 Updated 11 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆562 Updated 3 months ago
- PyTorch building blocks for the OLMo ecosystem ☆311 Updated this week
- A simple unified framework for evaluating LLMs ☆254 Updated 6 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆544 Updated 3 weeks ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆283 Updated last year
- Code for Quiet-STaR ☆739 Updated last year