antonpk1 / stackfish
Stackfish is an open-source LLM-powered pipeline designed to automatically solve competitive programming problems.
☆55Updated last year
Alternatives and similar repositories for stackfish
Users who are interested in stackfish are comparing it to the libraries listed below
- Pivotal Token Search☆144Updated last month
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆34Updated 9 months ago
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.☆102Updated 6 months ago
- A collection of lightweight interpretability scripts to understand how LLMs think☆89Updated 2 weeks ago
- LLM reads a paper and produces a working prototype☆60Updated 9 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆59Updated 3 months ago
- ☆55Updated last year
- A collection of reproducible inference engine benchmarks☆38Updated 9 months ago
- An MCP server that uses the Osmosis-Apply-1.7B model to apply code merges☆53Updated 7 months ago
- Simple high-throughput inference library☆155Updated 8 months ago
- Example implementation of Iteration of Thought - Give it a star if you like the project☆41Updated last year
- Lightweight Llama 3 8B Inference Engine in CUDA C☆53Updated 10 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆128Updated 3 months ago
- ☆67Updated 8 months ago
- Very minimal (and stateless) agent framework☆44Updated last year
- Small, simple agent task environments for training and evaluation☆19Updated last year
- Open-source alternative to alphaXiv☆108Updated 8 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure☆52Updated last year
- Framework-Agnostic RL Environments for LLM Fine-Tuning☆42Updated last week
- ☆62Updated 6 months ago
- Simple repository for training small reasoning models☆49Updated last year
- ☆24Updated last year
- ☆56Updated last year
- Lego for GRPO☆30Updated 8 months ago
- ☆20Updated 11 months ago
- Training setup for LangChain's Open Deep Research☆74Updated 5 months ago
- A benchmark for conversational bargaining by language models. In each 20‑round match one LLM plays buyer, one plays seller, and both hold…☆33Updated 5 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a 62% absolute increase in evaluation accuracy.☆65Updated 9 months ago
- Efficient non-uniform quantization with GPTQ for GGUF☆58Updated 4 months ago
- II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset☆31Updated 9 months ago