antonpk1 / stackfish
Stackfish is an open-source LLM-powered pipeline designed to automatically solve competitive programming problems.
☆43Updated 6 months ago
Alternatives and similar repositories for stackfish
Users who are interested in stackfish are comparing it to the repositories listed below.
- ☆54Updated last year
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.☆82Updated 3 weeks ago
- ☆63Updated last month
- ☆41Updated 5 months ago
- Simple repository for training small reasoning models☆33Updated 4 months ago
- Building a large language foundation model☆9Updated 3 months ago
- This repository contains a simple Llama 3 implementation in pure JAX.☆66Updated 4 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆31Updated 2 months ago
- ☆41Updated last month
- ☆20Updated last year
- An LLM reads a paper and produces a working prototype☆57Updated 2 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆68Updated 3 months ago
- Simple and efficient pytorch-native transformer training and inference (batched)☆76Updated last year
- rl from zero pretrain, can it be done? we'll see.☆56Updated this week
- Split model weights and execute partially☆4Updated 11 months ago
- Compiling useful links, papers, benchmarks, ideas, etc.☆46Updated 3 months ago
- In this repository I have code and brief explanations of the attempts that I made at the ARC-AGI (2024) challenges :)☆23Updated 7 months ago
- Official Code Release for "Training a Generally Curious Agent"☆25Updated last month
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆18Updated last month
- Pivotal Token Search☆107Updated last month
- Some experiments on transformer models☆11Updated last year
- ☆45Updated 9 months ago
- ☆64Updated 8 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆32Updated last month
- II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset☆20Updated 2 months ago
- ☆28Updated this week
- Lightweight Llama 3 8B Inference Engine in CUDA C☆47Updated 3 months ago
- 👷‍♂️ Minion is Agent's Brain. Minion is designed to execute any type of queries, offering a variety of features that demonstrate its flex…☆22Updated 2 weeks ago
- My solutions for Advanced Python Mastery (course by @dabeaz)☆11Updated last year
- Verbosity control for AI agents☆63Updated last year