project-numina / aimo-progress-prize

☆460

Alternatives and similar repositories for aimo-progress-prize

Users that are interested in aimo-progress-prize are comparing it to the libraries listed below

MoonshotAI / Kimina-Prover-Preview
Technical report of Kimina-Prover Preview.
☆320 Updated 3 weeks ago
NVIDIA / NeMo-Skills
A project to improve skills of large language models
☆501 Updated this week
microsoft / rStar
☆608 Updated 3 weeks ago
ekinakyurek / marc
Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"
☆321 Updated 8 months ago
huggingface / Math-Verify
☆870 Updated last month
trotsky1997 / MathBlackBox
☆1,028 Updated 7 months ago
deepseek-ai / DeepSeek-Prover-V1.5
☆531 Updated 11 months ago
eth-sri / matharena
Evaluation of LLMs on latest math competitions
☆155 Updated 2 weeks ago
sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆425 Updated last week
mlfoundations / evalchemy
Automatic evals for LLMs
☆496 Updated last month
idavidrein / gpqa
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
☆378 Updated 10 months ago
open-thought / reasoning-gym
procedural reasoning datasets
☆1,012 Updated this week
allenai / olmes
Reproducible, flexible LLM evaluations
☆227 Updated 3 weeks ago
lean-dojo / ReProver
Retrieval-Augmented Theorem Provers for Lean
☆287 Updated 6 months ago
Goedel-LM / Goedel-Prover
☆194 Updated 4 months ago
allenai / OLMo-core
PyTorch building blocks for the OLMo ecosystem
☆269 Updated this week
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆679 Updated last week
allenai / OLMoE
OLMoE: Open Mixture-of-Experts Language Models
☆830 Updated 4 months ago
SWE-Gym / SWE-Gym
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
☆516 Updated last week
SimpleBerry / LLaMA-O1
Large Reasoning Models
☆804 Updated 8 months ago
GAIR-NLP / MathPile
[NeurIPS D&B 2024] Generative AI for Math: MathPile
☆415 Updated 4 months ago
WildEval / ZeroEval
A simple unified framework for evaluating LLMs
☆235 Updated 3 months ago
GAIR-NLP / LIMO
[COLM 2025] LIMO: Less is More for Reasoning
☆993 Updated last week
huggingface / cosmopedia
☆529 Updated 8 months ago
McGill-NLP / nano-aha-moment
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
☆512 Updated 3 weeks ago
waterhorse1 / LLM_Tree_Search
(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training
☆278 Updated last year
MARIO-Math-Reasoning / Super_MARIO
☆337 Updated 2 months ago
sail-sg / understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
☆1,055 Updated last week
OSU-NLP-Group / GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆226 Updated 2 weeks ago
da-fr / arc-prize-2024
Our solution for the ARC challenge 2024
☆166 Updated last month