project-numina / aimo-progress-prize
☆433 Updated 9 months ago
Alternatives and similar repositories for aimo-progress-prize
Users who are interested in aimo-progress-prize are comparing it to the repositories listed below
- Technical report of Kimina-Prover Preview. ☆278 Updated last week
- ☆527 Updated last month
- A project to improve skills of large language models ☆383 Updated this week
- ☆691 Updated 2 weeks ago
- ☆328 Updated 3 months ago
- ☆518 Updated 9 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆211 Updated last year
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆517 Updated 2 months ago
- ☆1,019 Updated 5 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆191 Updated 5 months ago
- ☆151 Updated last year
- [ICLR 2024] Family of LLMs for mathematical reasoning. ☆263 Updated 5 months ago
- [NeurIPS D&B 2024] Generative AI for Math: MathPile ☆412 Updated last month
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆352 Updated last week
- ☆76 Updated 10 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆307 Updated 5 months ago
- ☆515 Updated 5 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆411 Updated last year
- Retrieval-Augmented Theorem Provers for Lean ☆272 Updated 3 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆156 Updated this week
- Large Reasoning Models ☆805 Updated 5 months ago
- GPQA: A Graduate-Level Google-Proof Q&A Benchmark ☆346 Updated 7 months ago
- State-of-the-art bilingual open-sourced Math reasoning LLMs. ☆508 Updated 6 months ago
- [ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie… ☆145 Updated 10 months ago
- ☆176 Updated last month
- RewardBench: the first evaluation tool for reward models. ☆566 Updated last week
- A bibliography and survey of the papers surrounding o1 ☆1,192 Updated 6 months ago
- LLMs + Lean, on your laptop or in the cloud ☆149 Updated last month
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆261 Updated 8 months ago
- ☆315 Updated 7 months ago