oripress / AlgoTune
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each problem and runs faster than existing implementations.
☆86Updated last week
Alternatives and similar repositories for AlgoTune
Users who are interested in AlgoTune are comparing it to the repositories listed below.
- ☆33Updated last year
- ☆191Updated 2 weeks ago
- ☆37Updated 8 months ago
- ☆31Updated 10 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆128Updated 3 months ago
- Evaluation of LLMs on latest math competitions☆214Updated last month
- ☆394Updated last week
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆88Updated last year
- Universal Reasoning Model☆122Updated 3 weeks ago
- ☆148Updated this week
- Official Repo for InSTA: Towards Internet-Scale Training For Agents☆55Updated 6 months ago
- A MAD laboratory to improve AI architecture designs 🧪☆137Updated last year
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"☆288Updated 2 months ago
- A testbed for agents and environments that can automatically improve models through data generation.☆28Updated 11 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆189Updated 10 months ago
- [ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective☆231Updated last week
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)☆198Updated last year
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)☆65Updated last week
- Fluid Language Model Benchmarking☆26Updated 4 months ago
- Reinforcing General Reasoning without Verifiers☆96Updated 7 months ago
- Simple repository for training small reasoning models☆48Updated last year
- AIRA-dojo: a framework for developing and evaluating AI research agents☆125Updated 2 weeks ago
- Minimum Description Length probing for neural network representations☆20Updated last year
- ☆134Updated 4 months ago
- ☆52Updated 10 months ago
- Can Language Models Solve Olympiad Programming?☆123Updated last year
- Harmonic Datasets☆52Updated last year
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper.☆42Updated 2 years ago
- moodist☆24Updated last month
- [ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess"☆36Updated 11 months ago