aadityasingh / icl-dynamics
☆14 Updated 5 months ago
Related projects:
- ☆99 Updated 10 months ago
- ☆33 Updated 3 months ago
- ☆54 Updated last week
- ☆75 Updated this week
- ☆68 Updated 7 months ago
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆26 Updated last month
- Sparse and discrete interpretability tool for neural networks ☆51 Updated 7 months ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆57 Updated 10 months ago
- ☆110 Updated 3 weeks ago
- ☆23 Updated last year
- ☆48 Updated 4 months ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆28 Updated 6 months ago
- Universal Neurons in GPT2 Language Models ☆25 Updated 3 months ago
- Mechanistic Interpretability for Transformer Models ☆48 Updated 2 years ago
- ☆174 Updated 4 months ago
- ☆57 Updated 2 years ago
- ☆64 Updated last month
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆34 Updated last year
- we got you bro ☆32 Updated last month
- ☆47 Updated 3 months ago
- A library to create and manage configuration files, especially for machine learning projects. ☆77 Updated 2 years ago
- ☆68 Updated last month
- NeuroSurgeon is a package that enables researchers to uncover and manipulate subnetworks within models in Huggingface Transformers ☆33 Updated last month
- ☆66 Updated last month
- ☆91 Updated last month
- ☆50 Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆84 Updated 4 months ago
- Mechanistic Interpretability Visualizations using React ☆175 Updated 2 months ago
- Implementation of Influence Function approximations for differently sized ML models, using PyTorch ☆15 Updated last year
- A library for efficient patching and automatic circuit discovery. ☆18 Updated 3 weeks ago