HumanCompatibleAI / leela-interp
Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
☆17 Updated 7 months ago
Alternatives and similar repositories for leela-interp:
Users who are interested in leela-interp are comparing it to the repositories listed below.
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆25 Updated 4 months ago
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind) ☆18 Updated 9 months ago
- Minimal but scalable implementation of large language models in JAX ☆28 Updated 2 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆28 Updated 2 months ago
- Scaling scaling laws with board games. ☆45 Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆42 Updated last month
- Mechanistic Interpretability for Transformer Models ☆49 Updated 2 years ago
- Sparse and discrete interpretability tool for neural networks ☆58 Updated 11 months ago
- Sparse Autoencoder Training Library ☆38 Updated 2 months ago
- ☆13 Updated 6 months ago
- Multiple datasets for ARC (Abstraction and Reasoning Corpus) ☆48 Updated last week
- we got you bro ☆33 Updated 5 months ago
- Code for minimum-entropy coupling. ☆31 Updated 6 months ago
- ☆48 Updated 3 months ago
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆67 Updated 2 years ago
- ☆25 Updated 9 months ago
- Interpreting how transformers simulate agents performing RL tasks ☆77 Updated last year
- If it quacks like a tensor... ☆55 Updated 2 months ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆82 Updated 11 months ago
- ☆11 Updated last month
- Redwood Research's transformer interpretability tools ☆13 Updated 2 years ago
- A quick way to get started with Transformer Lens ☆14 Updated last year
- ☆31 Updated last month
- A library for efficient patching and automatic circuit discovery. ☆46 Updated last month
- ☆31 Updated 9 months ago
- ☆25 Updated 2 months ago
- Simple (fast) transformer inference in PyTorch with torch.compile + lit-llama code ☆10 Updated last year
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆26 Updated 7 months ago
- Code and Data Repo for the CoNLL Paper -- Future Lens: Anticipating Subsequent Tokens from a Single Hidden State ☆18 Updated last year