project-numina / aimo-progress-prize
☆418 Updated 8 months ago
Alternatives and similar repositories for aimo-progress-prize:
Users who are interested in aimo-progress-prize are comparing it to the repositories listed below
- ☆574 Updated 2 weeks ago
- ☆493 Updated this week
- Large Reasoning Models ☆800 Updated 3 months ago
- A project to improve skills of large language models ☆260 Updated this week
- ☆325 Updated last month
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym ☆403 Updated 3 weeks ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆283 Updated last week
- Understanding R1-Zero-Like Training: A Critical Perspective ☆725 Updated this week
- ☆1,011 Updated 3 months ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆260 Updated 10 months ago
- ☆913 Updated 2 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆311 Updated 3 months ago
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆477 Updated 2 weeks ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆300 Updated 4 months ago
- ☆73 Updated 8 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆191 Updated 11 months ago
- procedural reasoning datasets ☆541 Updated this week
- A bibliography and survey of the papers surrounding o1 ☆1,183 Updated 4 months ago
- ☆147 Updated 10 months ago
- ☆312 Updated 6 months ago
- ☆262 Updated 2 weeks ago
- RewardBench: the first evaluation tool for reward models. ☆532 Updated last month
- ☆485 Updated 7 months ago
- Automatic evals for LLMs ☆346 Updated this week
- [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward ☆851 Updated last month
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆209 Updated last week
- Code for Quiet-STaR ☆721 Updated 7 months ago
- Retrieval-Augmented Theorem Provers for Lean ☆262 Updated 2 months ago
- ☆164 Updated last month
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆186 Updated 3 months ago