scienceetonnante / grokking
Demonstration of the grokking phenomenon in machine learning in a simple case
☆63 Updated 9 months ago
Alternatives and similar repositories for grokking
Users who are interested in grokking are comparing it to the libraries listed below.
- The boundary of neural network trainability is fractal ☆221 Updated last year
- 🧠 Starter templates for doing interpretability research ☆75 Updated 2 years ago
- Erasing concepts from neural representations with provable guarantees ☆239 Updated 10 months ago
- LENS Project ☆51 Updated last year
- ☆85 Updated last year
- Benchmarks for the Evaluation of LLM Supervision ☆32 Updated last month
- Parameter-Free Optimizers for Pytorch ☆130 Updated last year
- Tools for understanding how transformer predictions are built layer-by-layer ☆546 Updated 3 months ago
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable. ☆174 Updated 2 years ago
- Mechanistic Interpretability Visualizations using React ☆302 Updated 11 months ago
- Tools for studying developmental interpretability in neural networks. ☆114 Updated 5 months ago
- This repository contains the code for the paper "Inferring Neural Activity Before Plasticity: A Foundation for Learning Beyond Backpropag… ☆139 Updated last year
- Neural Networks and the Chomsky Hierarchy ☆211 Updated last year
- Public repo for course material on Bayesian machine learning at ENS Paris-Saclay and Univ Lille ☆92 Updated 9 months ago
- Software for Evolving Modular Robots in Unity ☆20 Updated 8 months ago
- Mutual information estimators and benchmark ☆54 Updated 2 months ago
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆833 Updated last month
- ☆30 Updated 2 years ago
- A lightweight library for Bayesian analysis of LLM evals (ICML 2025 Spotlight Position Paper) ☆21 Updated 6 months ago
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆323 Updated 4 months ago
- Deep Learning, an Energy Approach ☆222 Updated 5 months ago
- ML has an impact on the climate. But not all models are born equal. Compute your model's emissions with our calculator and add the result… ☆244 Updated last year
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL. ☆232 Updated 3 months ago
- Making your benchmark of optimization algorithms simple and open ☆275 Updated last week
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆228 Updated 11 months ago
- ☆548 Updated last year
- Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP. ☆236 Updated 4 months ago
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆191 Updated 2 years ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆130 Updated 3 years ago
- ☆18 Updated last year