oripress / AlgoTune
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each problem and runs faster than existing implementations.
☆85Updated this week
Alternatives and similar repositories for AlgoTune
Users who are interested in AlgoTune are comparing it to the repositories listed below.
- ☆33Updated last year
- Fluid Language Model Benchmarking☆25Updated 4 months ago
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)☆64Updated this week
- ☆31Updated 10 months ago
- ☆85Updated this week
- ☆37Updated 8 months ago
- ☆186Updated last week
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆88Updated last year
- moodist☆24Updated 3 weeks ago
- ☆93Updated last week
- Official Repo for InSTA: Towards Internet-Scale Training For Agents☆55Updated 6 months ago
- ☆82Updated 4 months ago
- Simple repository for training small reasoning models☆48Updated 11 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆127Updated 3 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆189Updated 10 months ago
- A testbed for agents and environments that can automatically improve models through data generation.☆28Updated 10 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"☆288Updated 2 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)☆198Updated last year
- ☆91Updated last year
- ☆52Updated 10 months ago
- Reinforcing General Reasoning without Verifiers☆93Updated 7 months ago
- 📄Small Batch Size Training for Language Models☆80Updated 3 months ago
- A MAD laboratory to improve AI architecture designs 🧪☆135Updated last year
- Evaluation of LLMs on latest math competitions☆213Updated last month
- Code for the paper "Function-Space Learning Rates"☆23Updated 7 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆34Updated 9 months ago
- Universal Reasoning Model☆121Updated 2 weeks ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods.☆26Updated last month
- [ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective☆226Updated this week
- implementation of dualformer☆24Updated 10 months ago