apartresearch / Neuron2Graph
Tools for exploring Transformer neuron behaviour, including input pruning and diversification.
☆22 Updated 2 years ago
Alternatives and similar repositories for Neuron2Graph
Users who are interested in Neuron2Graph are comparing it to the repositories listed below.
- ☆27 Updated 2 years ago
- Code and Data Repo for the CoNLL Paper -- Future Lens: Anticipating Subsequent Tokens from a Single Hidden State ☆20 Updated last month
- Code for 'Emergent Analogical Reasoning in Large Language Models' ☆51 Updated last year
- ☆14 Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Updated 2 years ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆97 Updated 2 years ago
- Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) ☆45 Updated last year
- Materials for ConceptARC paper ☆108 Updated last year
- Extracting spatial and temporal world models from LLMs ☆257 Updated 2 years ago
- Universal Neurons in GPT2 Language Models ☆31 Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆130 Updated 3 years ago
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆71 Updated last year
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆106 Updated 2 years ago
- Sparse and discrete interpretability tool for neural networks ☆64 Updated last year
- ☆70 Updated 3 years ago
- ☆142 Updated 4 months ago
- ☆119 Updated last month
- Google Research ☆46 Updated 3 years ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆133 Updated last year
- Code for Language-Interfaced FineTuning for Non-Language Machine Learning Tasks. ☆132 Updated last year
- Can Language Models Solve Olympiad Programming? ☆122 Updated 10 months ago
- A repository for transformer critique learning and generation ☆89 Updated last year
- Discovering Data-driven Hypotheses in the Wild ☆118 Updated 5 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆107 Updated 2 years ago
- A library for efficient patching and automatic circuit discovery. ☆80 Updated 4 months ago
- ☆25 Updated last year
- Probabilistic LLM evaluations. [CogSci2023; ACL2023] ☆73 Updated last year
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆191 Updated 2 years ago
- [NeurIPS 2023] Learning Transformer Programs ☆162 Updated last year
- ☆111 Updated 9 months ago