oripress / AlgoTune
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each problem and runs faster than existing implementations.
☆75Updated 3 weeks ago
Alternatives and similar repositories for AlgoTune
Users who are interested in AlgoTune are comparing it to the repositories listed below:
- ☆33Updated 11 months ago
- ☆169Updated last week
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508)☆58Updated 2 months ago
- Evaluation of LLMs on latest math competitions☆204Updated last month
- ☆34Updated 7 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"☆277Updated 3 weeks ago
- ☆74Updated last month
- 📄Small Batch Size Training for Language Models☆68Updated 2 months ago
- ☆52Updated 9 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆86Updated last year
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆121Updated 2 months ago
- ☆77Updated 2 months ago
- RLP: Reinforcement as a Pretraining Objective☆213Updated 2 months ago
- ☆162Updated 4 months ago
- Fluid Language Model Benchmarking☆22Updated 3 months ago
- ☆31Updated 8 months ago
- Reinforcing General Reasoning without Verifiers☆92Updated 5 months ago
- Open source interpretability artefacts for R1.☆165Updated 7 months ago
- Simple repository for training small reasoning models☆47Updated 10 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning.☆120Updated last month
- The official github repo for "Diffusion Language Models are Super Data Learners".☆212Updated last month
- Defeating the Training-Inference Mismatch via FP16☆163Updated last month
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆189Updated 9 months ago
- SWE Arena☆35Updated 5 months ago
- A testbed for agents and environments that can automatically improve models through data generation.☆27Updated 9 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆84Updated 9 months ago
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…☆115Updated last month
- Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization☆36Updated last month
- ☆110Updated last year
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources☆149Updated 2 months ago