apple / ml-np-rasp
☆20 Updated 2 years ago
Alternatives and similar repositories for ml-np-rasp
Users who are interested in ml-np-rasp are comparing it to the libraries listed below.
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆132 Updated 3 years ago
- Tools for studying developmental interpretability in neural networks. ☆124 Updated 3 weeks ago
- Mechanistic Interpretability for Transformer Models ☆53 Updated 3 years ago
- Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference… ☆216 Updated last week
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Updated 2 years ago
- ☆28 Updated 2 years ago
- Implementing RASP transformer programming language https://arxiv.org/pdf/2106.06981.pdf. ☆59 Updated 2 months ago
- A domain-specific probabilistic programming language for modeling and inference with language models ☆140 Updated 8 months ago
- ☆29 Updated last year
- Extract full next-token probabilities via language model APIs ☆248 Updated last year
- Redwood Research's transformer interpretability tools ☆15 Updated 3 years ago
- ☆132 Updated 2 years ago
- Language-annotated Abstraction and Reasoning Corpus ☆99 Updated 2 years ago
- Erasing concepts from neural representations with provable guarantees ☆242 Updated last year
- Neural Networks and the Chomsky Hierarchy ☆212 Updated last year
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆199 Updated 2 years ago
- we got you bro ☆37 Updated last year
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind) ☆20 Updated last year
- A library to create and manage configuration files, especially for machine learning projects. ☆79 Updated 3 years ago
- Sparse Autoencoder Training Library ☆56 Updated 8 months ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆61 Updated 3 years ago
- Learning Universal Predictors ☆81 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆64 Updated last year
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆83 Updated 3 years ago
- ☆76 Updated last year
- See the issue board for the current status of active and prospective projects! ☆65 Updated 3 years ago
- git extension for {collaborative, communal, continual} model development ☆217 Updated last year
- ☆76 Updated 3 years ago
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆63 Updated 4 years ago
- Code associated to papers on superposition (in ML interpretability) ☆35 Updated 3 years ago