Train the smallest LM you can that fits in 16MB. Best model wins!
☆1,545 · Mar 19, 2026 · Updated this week
Alternatives and similar repositories for parameter-golf
Users who are interested in parameter-golf are comparing it to the libraries listed below.
- Exploring the minimal architecture required for coherent English language generation. ☆12 · Mar 5, 2025 · Updated last year
- ☆14 · Dec 12, 2024 · Updated last year
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆39 · Dec 2, 2025 · Updated 3 months ago
- ☆44 · Feb 20, 2026 · Updated last month
- look how they massacred my boy ☆63 · Oct 16, 2024 · Updated last year
- A Qwen .5B reasoning model trained on OpenR1-Math-220k ☆14 · Oct 11, 2025 · Updated 5 months ago
- ☆17 · Mar 24, 2024 · Updated last year
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆22 · Jul 4, 2025 · Updated 8 months ago
- ☆19 · Mar 25, 2025 · Updated 11 months ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆78 · Sep 4, 2023 · Updated 2 years ago
- Minimal open-source implementation of AlphaProof and HyperTree Proof Search. ☆72 · Mar 9, 2026 · Updated last week
- Unix native interface to LLMs ☆12 · Oct 16, 2025 · Updated 5 months ago
- Using deep research workflow to generate datasets for finetuning LLMs. ☆39 · Oct 9, 2025 · Updated 5 months ago
- Pytorch Implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆25 · Updated this week
- Official implementation of "BERTs are Generative In-Context Learners" ☆32 · Mar 14, 2025 · Updated last year
- ☆61 · Mar 14, 2026 · Updated last week
- Grokking on modular arithmetic in less than 150 epochs in MLX ☆16 · Oct 24, 2024 · Updated last year
- Training tiny models to prove hard theorems ☆59 · Mar 5, 2026 · Updated 2 weeks ago
- ☆13 · Apr 23, 2019 · Updated 6 years ago
- ☆28 · Sep 22, 2025 · Updated 5 months ago
- Everything for the Paper: 'Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing' ☆19 · Dec 2, 2023 · Updated 2 years ago
- supporting pytorch FSDP for optimizers ☆84 · Dec 8, 2024 · Updated last year
- jcat (jupyter cat) is a command line tool for viewing notebook(*.ipynb) files in terminal. ☆10 · Sep 17, 2022 · Updated 3 years ago
- Official Code for Rectified LpJEPA: Joint-Embedding Predictive Architectures with Sparse and Maximum-Entropy Representations ☆66 · Feb 15, 2026 · Updated last month
- Benchmark and analysis of 165 pretrained SSL models. Code for "Evaluating Self-Supervised Learning via Risk Decomposition". ☆14 · Jul 26, 2023 · Updated 2 years ago
- [ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering ☆44 · Jun 18, 2022 · Updated 3 years ago
- ☆13 · Jan 4, 2023 · Updated 3 years ago
- A toolkit for dialogue system evaluation via crowdsourcing ☆18 · Apr 25, 2023 · Updated 2 years ago
- The official code of "Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers" ☆19 · Jul 24, 2024 · Updated last year
- dataset for Detecting and Explaining Causes From Text For a Time Series Event, EMNLP'17 ☆15 · Aug 31, 2020 · Updated 5 years ago
- Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2 ☆15 · Jun 27, 2025 · Updated 8 months ago
- An AI character interaction system with emotional modeling and advanced memory management ☆17 · Oct 26, 2024 · Updated last year
- A repurpose of a Counter-Strike: Global Offensive cheat for in-game data collection and dataset creation. ☆15 · Jan 15, 2022 · Updated 4 years ago
- Let LLMs play Counter-Strike 1.6 ☆16 · May 15, 2025 · Updated 10 months ago
- ☆76 · Feb 18, 2026 · Updated last month
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 · Jun 22, 2025 · Updated 9 months ago
- train with kittens! ☆64 · Oct 25, 2024 · Updated last year
- my first ever browser game ☆10 · Jun 21, 2025 · Updated 9 months ago
- Lightly-reviewed collection of community environments ☆219 · Mar 12, 2026 · Updated last week