6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.
☆21 Feb 19, 2026 Updated last week
Alternatives and similar repositories for smallest-addition-transformer-claude-code
Users who are interested in smallest-addition-transformer-claude-code are comparing it to the libraries listed below
- ☆29 Nov 30, 2025 Updated 3 months ago
- ☆46 Jul 21, 2025 Updated 7 months ago
- ☆25 Feb 20, 2026 Updated last week
- The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size ☆19 May 19, 2019 Updated 6 years ago
- Benchmarking Optimizers for LLM Pretraining ☆52 Dec 30, 2025 Updated 2 months ago
- A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively. ☆62 Feb 22, 2026 Updated last week
- ☆56 Sep 17, 2025 Updated 5 months ago
- ☆35 Jul 5, 2023 Updated 2 years ago
- 100M tokens, no time limit, best val loss wins! ☆103 Updated this week
- ☆84 Aug 31, 2023 Updated 2 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 Apr 15, 2024 Updated last year
- AlgZoo: uninterpreted models with fewer than 1,500 parameters ☆43 Jan 19, 2026 Updated last month
- Exploring the minimal architecture required for coherent English language generation. ☆12 Mar 5, 2025 Updated 11 months ago
- Run a raffle among the 🌟 stargazers 🌟 of a Github project! ☆11 Mar 23, 2023 Updated 2 years ago
- Code for "What really matters in matrix-whitening optimizers?" ☆21 Oct 31, 2025 Updated 4 months ago
- Provide a simple, script-friendly interface to posting stuff on matrix channels ☆10 Mar 9, 2018 Updated 7 years ago
- Simple MoE - Day 17 of 365 Days of Repos ☆16 Jan 17, 2025 Updated last year
- Advent Of Code solutions in Haskell ☆11 Dec 8, 2019 Updated 6 years ago
- Streaming effects for PureScript ☆16 Nov 8, 2021 Updated 4 years ago
- A project designed to build and render a full Minecraft crafting tree. ☆10 Aug 10, 2021 Updated 4 years ago
- Automated Design of Agentic Systems ☆10 Sep 7, 2024 Updated last year
- ☆11 Jan 4, 2023 Updated 3 years ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆10 Dec 30, 2024 Updated last year
- This is a repository for RM2021 Software tutorial ☆11 Nov 4, 2020 Updated 5 years ago
- Repository for my dotfiles ☆12 Dec 3, 2025 Updated 2 months ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 Oct 31, 2024 Updated last year
- Pytorch routines for (Ker)nel (Mac)hines ☆10 Oct 10, 2025 Updated 4 months ago
- Simple parsing of CSV into case classes in Scala ☆11 Feb 12, 2025 Updated last year
- [ICLR26] AI-based scaling law discovery ☆26 Jan 30, 2026 Updated last month
- Tutorials for MATH 4432 Statistical Machine Learning, HKUST, Fall 2022 ☆11 Sep 17, 2024 Updated last year
- The code is for our AAAI2023 paper: Efficient Embeddings of Logical Variables for Query Answering over Incomplete Knowledge Graphs (Ding… ☆10 Dec 17, 2022 Updated 3 years ago
- A guide for those who are stuck not being able to migrate their apps away from JS/TypeScript/Flow. See index.tsx ☆10 Nov 15, 2017 Updated 8 years ago
- [ACL 2025] Official implementation of the "CoT-ICL Lab" framework ☆11 Oct 10, 2025 Updated 4 months ago
- For a given Haskell source file, determine where a symbol is imported from ☆27 Nov 16, 2018 Updated 7 years ago
- ☆12 Oct 23, 2024 Updated last year
- A reaaaaaally lenient HTML parser for Purescript inspired by ndmitchell's TagSoup ☆12 Jul 27, 2018 Updated 7 years ago
- ☆18 Apr 16, 2025 Updated 10 months ago
- Haskell implementation of Glumpy ☆12 Jun 21, 2021 Updated 4 years ago
- A glowfic to epub converter. ☆14 Jan 25, 2026 Updated last month