scienceetonnante / grokking

Demonstration of the grokking phenomenon in machine learning in a simple case

☆63

Alternatives and similar repositories for grokking

Users who are interested in grokking are comparing it to the repositories listed below

Sohl-Dickstein / fractal
The boundary of neural network trainability is fractal
☆221 Updated last year
apartresearch / interpretability-starter
🧠 Starter templates for doing interpretability research
☆75 Updated 2 years ago
EleutherAI / concept-erasure
Erasing concepts from neural representations with provable guarantees
☆239 Updated 10 months ago
serre-lab / Lens
LENS Project
☆51 Updated last year
jessicarumbelow / Backwards
☆85 Updated last year
CentreSecuriteIA / BELLS
Benchmarks for the Evaluation of LLM Supervision
☆32 Updated last month
bremen79 / parameterfree
Parameter-Free Optimizers for Pytorch
☆130 Updated last year
AlignmentResearch / tuned-lens
Tools for understanding how transformer predictions are built layer-by-layer
☆546 Updated 3 months ago
KindXiaoming / BIMT
Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.
☆174 Updated 2 years ago
TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆302 Updated 11 months ago
timaeus-research / devinterp
Tools for studying developmental interpretability in neural networks.
☆114 Updated 5 months ago
YuhangSong / Prospective-Configuration
This repository contains the code for the paper "Inferring Neural Activity Before Plasticity: A Foundation for Learning Beyond Backpropag…
☆139 Updated last year
google-deepmind / neural_networks_chomsky_hierarchy
Neural Networks and the Chomsky Hierarchy
☆211 Updated last year
rbardenet / bml-course
Public repo for course material on Bayesian machine learning at ENS Paris-Saclay and Univ Lille
☆92 Updated 9 months ago
FrankVeenstra / EvolvingModularRobots_Unity
Software for Evolving Modular Robots in Unity
☆20 Updated 8 months ago
cbg-ethz / bmi
Mutual information estimators and benchmark
☆54 Updated 2 months ago
stanfordnlp / pyvene
Stanford NLP Python library for understanding and improving PyTorch models via interventions
☆833 Updated last month
ellisk42 / humanlike_fewshot_learning
☆30 Updated 2 years ago
sambowyer / bayes_evals
A lightweight library for Bayesian analysis of LLM evals (ICML 2025 Spotlight Position Paper)
☆21 Updated 6 months ago
Prisma-Multimodal / ViT-Prisma
ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs).
☆323 Updated 4 months ago
Atcold / Energy-Book
Deep Learning, an Energy Approach
☆222 Updated 5 months ago
mlco2 / impact
ML has an impact on the climate. But not all models are born equal. Compute your model's emissions with our calculator and add the result…
☆244 Updated last year
callummcdougall / ARENA_2.0
Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL.
☆232 Updated 3 months ago
benchopt / benchopt
Making your benchmark of optimization algorithms simple and open
☆275 Updated last week
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆228 Updated 11 months ago
google-deepmind / tracr
☆548 Updated last year
chr5tphr / zennit
Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.
☆236 Updated 4 months ago
likenneth / othello_world
Emergent world representations: Exploring a sequence model trained on a synthetic task
☆191 Updated 2 years ago
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆130 Updated 3 years ago
marikgoldstein / slides
☆18 Updated last year