Sea-Snell / grokking

Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"

☆79

Alternatives and similar repositories for grokking

Users who are interested in grokking are comparing it to the repositories listed below.


danielmamay / grokking
Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper.
☆39 Updated 2 years ago
likenneth / othello_world
Emergent world representations: Exploring a sequence model trained on a synthetic task
☆191 Updated 2 years ago
KindXiaoming / Omnigrok
Omnigrok: Grokking Beyond Algorithmic Data
☆62 Updated 2 years ago
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆129 Updated 3 years ago
google-deepmind / neural_networks_chomsky_hierarchy
Neural Networks and the Chomsky Hierarchy
☆210 Updated last year
ethancaballero / broken_neural_scaling_laws
Code Release for "Broken Neural Scaling Laws" (BNSL) paper
☆59 Updated last year
GFNOrg / gfn-lm-tuning
☆185 Updated last year
lee-ny / teaching_arithmetic
☆83 Updated 2 years ago
mechanistic-interpretability-grokking / progress-measures-paper
☆69 Updated 3 years ago
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆109 Updated 5 months ago
TomFrederik / unseal
Mechanistic Interpretability for Transformer Models
☆53 Updated 3 years ago
Ping-C / optimizer
This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…
☆40 Updated 2 years ago
redwoodresearch / Easy-Transformer
☆126 Updated last year
edwardjhu / TP4
Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522)
☆63 Updated 4 years ago
bilal-chughtai / rep-theory-mech-interp
☆27 Updated 2 years ago
srush / do-we-need-attention
☆166 Updated 2 years ago
princeton-nlp / TransformerPrograms
[NeurIPS 2023] Learning Transformer Programs
☆162 Updated last year
p-lambda / incontext-learning
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici…
☆108 Updated last year
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆131 Updated 10 months ago
aks2203 / easy-to-hard
Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks"
☆59 Updated 3 years ago
dtsip / in-context-learning
☆240 Updated last year
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆55 Updated 5 months ago
KihoPark / linear_rep_geometry
☆107 Updated 8 months ago
timaeus-research / devinterp
Tools for studying developmental interpretability in neural networks.
☆111 Updated 4 months ago
wesg52 / universal-neurons
Universal Neurons in GPT2 Language Models
☆30 Updated last year
Sea-Snell / JAXSeq
Train very large language models in Jax.
☆209 Updated 2 years ago
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆141 Updated last year
ejmichaud / grokking-squared
☆26 Updated 2 years ago
krandiash / quinine
A library to create and manage configuration files, especially for machine learning projects.
☆80 Updated 3 years ago
teddykoker / grokking
PyTorch implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆37 Updated 3 years ago