antonpk1 / stackfish
Stackfish is an open-source LLM-powered pipeline designed to automatically solve competitive programming problems.
☆49Updated last year
Alternatives and similar repositories for stackfish
Users who are interested in stackfish are comparing it to the repositories listed below.
- ☆55Updated last year
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.☆99Updated 5 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆121Updated 2 months ago
- A collection of lightweight interpretability scripts to understand how LLMs think☆71Updated this week
- Cray-LM unified training and inference stack.☆22Updated 10 months ago
- ☆59Updated 10 months ago
- ☆68Updated 7 months ago
- Simple repository for training small reasoning models☆47Updated 10 months ago
- Make triton easier☆49Updated last year
- LLM reads a paper and produces a working prototype☆60Updated 8 months ago
- II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset☆31Updated 8 months ago
- Commit0: Library Generation from Scratch☆173Updated 7 months ago
- Train an agent to generate high quality summaries☆39Updated 5 months ago
- Pivotal Token Search☆135Updated this week
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆34Updated 8 months ago
- Small, simple agent task environments for training and evaluation☆19Updated last year
- 🤖 Complete reproduction of 'AlphaGo Moment for Model Architecture Discovery' using MLX-LM instead of GPT-4. Autonomous neural architectu…☆26Updated 4 months ago
- Automated Capability Discovery via Foundation Model Self-Exploration☆65Updated 10 months ago
- Track the progress of LLM context utilisation☆55Updated 8 months ago
- NanoGPT (124M) quality in 2.67B tokens☆28Updated 3 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.☆63Updated 7 months ago
- Simple GRPO scripts and configurations.☆59Updated 10 months ago
- ☆19Updated 9 months ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?☆233Updated last month
- Compiling useful links, papers, benchmarks, ideas, etc.☆45Updated 9 months ago
- A collection of reproducible inference engine benchmarks☆38Updated 8 months ago
- Simple high-throughput inference library☆153Updated 7 months ago
- EXO Gym is an open-source Python toolkit that facilitates distributed AI research.☆88Updated 3 weeks ago
- Alice in Wonderland code base for experiments and raw experiment data☆131Updated 3 months ago