apartresearch / Neuron2Graph
Tools for exploring Transformer neuron behaviour, including input pruning and diversification.
☆23 Updated 2 years ago
Alternatives and similar repositories for Neuron2Graph
Users who are interested in Neuron2Graph also compare it to the libraries listed below.
- Materials for ConceptARC paper ☆110 Updated last year
- ☆28 Updated 2 years ago
- Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) ☆46 Updated last year
- ☆214 Updated 2 years ago
- Universal Neurons in GPT2 Language Models ☆31 Updated last year
- Code for 'Emergent Analogical Reasoning in Large Language Models' ☆51 Updated last year
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆197 Updated 2 years ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Updated 2 years ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆132 Updated 3 years ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆136 Updated last year
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆106 Updated 2 years ago
- Code and Data Repo for the CoNLL Paper -- Future Lens: Anticipating Subsequent Tokens from a Single Hidden State ☆20 Updated 2 months ago
- Sparse and discrete interpretability tool for neural networks ☆65 Updated last year
- ☆73 Updated 3 years ago
- Applying Behaviorally-Informed Meta-Learning (BIML) to machine learning benchmarks ☆52 Updated 2 years ago
- Extracting spatial and temporal world models from LLMs ☆257 Updated 2 years ago
- A virtual environment for developing and evaluating automated scientific discovery agents. ☆198 Updated 10 months ago
- Meta-Learning for Compositionality (MLC) for modeling human behavior ☆145 Updated last month
- Sparse Autoencoder Training Library ☆56 Updated 8 months ago
- Discovering Data-driven Hypotheses in the Wild ☆124 Updated 7 months ago
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆210 Updated 2 years ago
- Evaluation of neuro-symbolic engines ☆40 Updated last year
- Probabilistic LLM evaluations. [CogSci2023; ACL2023] ☆73 Updated last year
- ☆25 Updated last year
- Attribution-based Parameter Decomposition ☆33 Updated 7 months ago
- ☆135 Updated last year
- A library for efficient patching and automatic circuit discovery. ☆84 Updated last week
- Learning Universal Predictors ☆81 Updated last year
- A repository for transformer critique learning and generation ☆89 Updated 2 years ago
- ☆56 Updated 2 years ago