oripress / AlgoTune
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each problem and runs faster than existing implementations.
☆60 Updated this week
Alternatives and similar repositories for AlgoTune
Users who are interested in AlgoTune are comparing it to the repositories listed below
- ☆33 Updated 8 months ago
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆37 Updated last week
- Reinforcing General Reasoning without Verifiers ☆83 Updated 3 months ago
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆55 Updated 2 months ago
- 📄Small Batch Size Training for Language Models ☆62 Updated last month
- Simple repository for training small reasoning models ☆40 Updated 7 months ago
- Evaluation of LLMs on latest math competitions ☆165 Updated last week
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆84 Updated 10 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆20 Updated 6 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆99 Updated last month
- Open Source Replication of Anthropic's Alignment Faking Paper ☆50 Updated 5 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆77 Updated 6 months ago
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆21 Updated 2 months ago
- ☆20 Updated last month
- ☆31 Updated 6 months ago
- Esoteric Language Models ☆99 Updated 2 months ago
- ☆35 Updated 4 months ago
- ☆40 Updated 3 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆111 Updated last month
- EvaByte: Efficient Byte-level Language Models at Scale ☆109 Updated 5 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆193 Updated last year
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning ☆147 Updated last week
- ☆122 Updated 7 months ago
- Open source interpretability artefacts for R1. ☆159 Updated 5 months ago
- ☆56 Updated 10 months ago
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆95 Updated last month
- Harmonic Datasets ☆47 Updated last year
- Code for "Reasoning to Learn from Latent Thoughts" ☆118 Updated 5 months ago
- ☆85 Updated last year
- ☆27 Updated 3 months ago