KindXiaoming / Omnigrok

Omnigrok: Grokking Beyond Algorithmic Data

☆61

Alternatives and similar repositories for Omnigrok

Users who are interested in Omnigrok are comparing it to the repositories listed below.

Sea-Snell / grokking
unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆78 Updated 3 years ago
locuslab / edge-of-stability
☆70 Updated 8 months ago
danielmamay / grokking
Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper.
☆38 Updated last year
shikaiqiu / compute-better-spent
☆53 Updated 10 months ago
ejmichaud / grokking-squared
☆26 Updated 2 years ago
edwardjhu / TP4
Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522)
☆62 Updated 4 years ago
AhmedImtiazPrio / grok-adversarial
Deep Networks Grok All the Time and Here is Why
☆37 Updated last year
Silent-Zebra / twisted-smc-lm
☆30 Updated 4 months ago
KindXiaoming / BIMT
Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.
☆172 Updated 2 years ago
Ping-C / optimizer
This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…
☆37 Updated 2 years ago
wesg52 / universal-neurons
Universal Neurons in GPT2 Language Models
☆30 Updated last year
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆102 Updated 2 months ago
machine-discovery / deer
Parallelizing non-linear sequential models over the sequence length
☆53 Updated last month
AndPotap / einsum-search
☆32 Updated 10 months ago
fjzzq2002 / pizza
Code repository for "The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks"
☆17 Updated last year
srush / do-we-need-attention
☆166 Updated 2 years ago
formll / dog
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
☆63 Updated last year
dtsip / in-context-learning
☆235 Updated last year
berlino / seq_icl
☆53 Updated last year
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆54 Updated 3 months ago
xu-ji / information-bottleneck
Deep Learning & Information Bottleneck
☆61 Updated 2 years ago
mechanistic-interpretability-grokking / progress-measures-paper
☆68 Updated 2 years ago
DeqingFu / transformers-icl-second-order
Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode…
☆18 Updated 8 months ago
adamkarvonen / SAE_BoardGameEval
☆23 Updated 6 months ago
stanislavfort / dissect-git-re-basin
Replicating and dissecting the git-re-basin project in one-click-replication Colabs
☆36 Updated 2 years ago
bilal-chughtai / rep-theory-mech-interp
☆26 Updated 2 years ago
vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆85 Updated last year
taufeeque9 / codebook-features
Sparse and discrete interpretability tool for neural networks
☆63 Updated last year
Leiay / looped_transformer
☆31 Updated last year
srush / mamba-primer
☆37 Updated last year