Sea-Snell / grokking
unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆76 Updated 2 years ago
Alternatives and similar repositories for grokking:
Users who are interested in grokking compare it to the libraries listed below.
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper. ☆36 Updated last year
- Omnigrok: Grokking Beyond Algorithmic Data ☆54 Updated 2 years ago
- PyTorch implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆36 Updated 3 years ago
- Scaling scaling laws with board games. ☆48 Updated last year
- ☆173 Updated last year
- Language-annotated Abstraction and Reasoning Corpus ☆84 Updated last year
- ☆61 Updated 2 years ago
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆61 Updated 3 years ago
- ☆24 Updated 2 years ago
- ☆26 Updated 11 months ago
- Universal Neurons in GPT2 Language Models ☆27 Updated 10 months ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆36 Updated 2 years ago
- Mechanistic Interpretability for Transformer Models ☆50 Updated 2 years ago
- ☆25 Updated this week
- Sparse Autoencoder Training Library ☆47 Updated 5 months ago
- ☆26 Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆76 Updated 3 weeks ago
- Redwood Research's transformer interpretability tools ☆14 Updated 2 years ago
- ☆78 Updated last year
- ☆51 Updated 10 months ago
- Code associated to papers on superposition (in ML interpretability) ☆28 Updated 2 years ago
- ☆114 Updated 7 months ago
- ☆90 Updated last month
- ☆66 Updated 4 months ago
- nanoGPT-like codebase for LLM training ☆91 Updated this week
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆187 Updated 10 months ago
- LoRA for arbitrary JAX models and functions ☆135 Updated last year
- ☆65 Updated 3 months ago
- Tools for studying developmental interpretability in neural networks. ☆87 Updated 2 months ago
- Materials for ConceptARC paper ☆90 Updated 4 months ago