project-numina / aimo-progress-prize
☆318 Updated 3 months ago
Related projects
Alternatives and complementary repositories for aimo-progress-prize
- ☆226 Updated 3 months ago
- ☆451 Updated 3 weeks ago
- ☆252 Updated last month
- Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code" ☆212 Updated last month
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models ☆835 Updated 7 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆161 Updated last month
- The official evaluation suite and dynamic data release for MixEval. ☆224 Updated last week
- [NeurIPS D&B 2024] Generative AI for Math: MathPile ☆394 Updated 3 weeks ago
- Large Reasoning Models ☆580 Updated this week
- A bibliography and survey of the papers surrounding o1 ☆754 Updated this week
- ☆287 Updated 2 months ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆219 Updated 5 months ago
- Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality s… ☆491 Updated 2 weeks ago
- OLMoE: Open Mixture-of-Experts Language Models ☆460 Updated this week
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context ☆438 Updated 8 months ago
- Code for Quiet-STaR ☆651 Updated 3 months ago
- A pipeline to improve skills of large language models ☆191 Updated this week
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆647 Updated last month
- ☆935 Updated 2 weeks ago
- Official repository for ORPO ☆421 Updated 5 months ago
- AWM: Agent Workflow Memory ☆205 Updated last month
- ☆515 Updated this week
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆204 Updated this week
- [ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie… ☆94 Updated 4 months ago
- [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward ☆714 Updated 2 weeks ago
- ☆118 Updated 6 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆181 Updated this week
- RewardBench: the first evaluation tool for reward models. ☆431 Updated 3 weeks ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆307 Updated 7 months ago