oripress / AlgoTune
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each problem and runs faster than existing implementations.
☆80 Updated this week
Alternatives and similar repositories for AlgoTune
Users who are interested in AlgoTune are comparing it to the repositories listed below.
- ☆33 Updated last year
- ☆178 Updated 3 weeks ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆125 Updated 3 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆87 Updated last year
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆63 Updated this week
- ☆34 Updated 7 months ago
- The official GitHub repo for "Diffusion Language Models are Super Data Learners". ☆215 Updated 2 months ago
- Fluid Language Model Benchmarking ☆25 Updated 3 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆284 Updated last month
- Evaluation of LLMs on latest math competitions ☆211 Updated 2 weeks ago
- Simple repository for training small reasoning models ☆47 Updated 11 months ago
- 📄 Small Batch Size Training for Language Models ☆77 Updated 3 months ago
- ☆31 Updated 9 months ago
- ☆33 Updated last year
- Reinforcing General Reasoning without Verifiers ☆93 Updated 6 months ago
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆118 Updated 2 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆136 Updated last year
- ☆52 Updated 9 months ago
- RLP: Reinforcement as a Pretraining Objective ☆222 Updated 3 months ago
- ☆91 Updated last year
- ☆79 Updated 2 months ago
- A basic pure PyTorch implementation of flash attention ☆16 Updated last year
- Official implementation of GRAPE: Group Representational Position Encoding (https://arxiv.org/abs/2512.07805) ☆70 Updated this week
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆21 Updated 6 months ago
- ☆19 Updated 9 months ago
- ☆80 Updated 3 months ago
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper. ☆40 Updated 2 years ago
- Open source interpretability artefacts for R1. ☆165 Updated 8 months ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆26 Updated 3 weeks ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆85 Updated 9 months ago