project-numina / aimo-progress-prize
☆482 · Updated last year
Alternatives and similar repositories for aimo-progress-prize
Users who are interested in aimo-progress-prize are comparing it to the libraries listed below.
- Technical report of Kimina-Prover Preview. ☆359 · Updated 7 months ago
- A project to improve skills of large language models ☆813 · Updated this week
- ☆554 · Updated last year
- ☆1,033 · Updated last year
- ☆224 · Updated 10 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆627 · Updated 2 weeks ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆343 · Updated 3 months ago
- ☆1,088 · Updated last month
- GPQA: A Graduate-Level Google-Proof Q&A Benchmark ☆466 · Updated last year
- Evaluation of LLMs on latest math competitions ☆216 · Updated last week
- ☆342 · Updated 8 months ago
- Automatic evals for LLMs ☆579 · Updated last month
- [COLM 2025] LIMO: Less is More for Reasoning ☆1,062 · Updated 6 months ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,332 · Updated 3 weeks ago
- Reproducible, flexible LLM evaluations ☆338 · Updated 2 weeks ago
- A simple unified framework for evaluating LLMs ☆261 · Updated 9 months ago
- ☆564 · Updated last year
- RewardBench: the first evaluation tool for reward models. ☆687 · Updated last week
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆285 · Updated last year
- Retrieval-Augmented Theorem Provers for Lean ☆316 · Updated last year
- PyTorch building blocks for the OLMo ecosystem ☆785 · Updated this week
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆273 · Updated last year
- ☆411 · Updated last month
- [MathCoder, MathCoder-VL] Family of LLMs/LMMs for mathematical reasoning. ☆339 · Updated 3 months ago
- [NeurIPS D&B 2024] Generative AI for Math: MathPile ☆419 · Updated 10 months ago
- ☆167 · Updated last year
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆350 · Updated last week
- Large Reasoning Models ☆807 · Updated last year
- ☆330 · Updated 8 months ago
- A bibliography and survey of the papers surrounding o1 ☆1,212 · Updated last year