hannamw / MIB-circuit-track
☆23 Jun 30, 2025 Updated 7 months ago
Alternatives and similar repositories for MIB-circuit-track
Users who are interested in MIB-circuit-track are comparing it to the libraries listed below.
- ☆31 Updated this week
- Landing page for MIB: A Mechanistic Interpretability Benchmark ☆24 Aug 15, 2025 Updated 6 months ago
- ☆71 Jul 24, 2025 Updated 6 months ago
- ☆17 Aug 30, 2025 Updated 5 months ago
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 Jul 18, 2024 Updated last year
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆64 Oct 27, 2024 Updated last year
- A benchmark for mechanistic discovery of circuits in Transformers ☆16 Dec 15, 2024 Updated last year
- Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery" ☆47 May 31, 2024 Updated last year
- Engine for collecting, uploading, and downloading model activations ☆26 Apr 2, 2025 Updated 10 months ago
- A library for efficient patching and automatic circuit discovery. ☆90 Dec 31, 2025 Updated last month
- Minimum Description Length probing for neural network representations ☆20 Jan 28, 2025 Updated last year
- A library for mechanistic anomaly detection ☆22 Jan 9, 2025 Updated last year
- ☆20 Apr 10, 2025 Updated 10 months ago
- Evaluation code and data for "Automatic Correction of Human Translations" [NAACL 2022]. ☆19 Dec 9, 2022 Updated 3 years ago
- graphpatch is a library for activation patching on PyTorch neural network models. ☆20 Feb 11, 2025 Updated last year
- ☆25 Apr 18, 2025 Updated 9 months ago
- [NAACL 2022] GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers ☆21 May 16, 2023 Updated 2 years ago
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆30 Oct 27, 2025 Updated 3 months ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆29 Feb 6, 2026 Updated last week
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆30 Apr 22, 2025 Updated 9 months ago
- Attribution-based Parameter Decomposition ☆33 Jun 11, 2025 Updated 8 months ago
- Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations" ☆31 Apr 28, 2023 Updated 2 years ago
- ☆27 Jun 12, 2023 Updated 2 years ago
- ☆83 Feb 25, 2025 Updated 11 months ago
- 🪝PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models ☆12 May 30, 2025 Updated 8 months ago
- ☆51 Oct 23, 2023 Updated 2 years ago
- A framework for evaluating Machine Translation models. ☆12 May 26, 2025 Updated 8 months ago
- ☆14 Apr 29, 2025 Updated 9 months ago
- Spark projects. Learning book "Machine Learning with Spark" ☆10 Jun 3, 2017 Updated 8 years ago
- Implementation of Implicit Reparameterization Trick ☆11 Dec 2, 2024 Updated last year
- Typeclass for array types ☆19 Apr 7, 2025 Updated 10 months ago
- NeuroSurgeon is a package that enables researchers to uncover and manipulate subnetworks within models in Huggingface Transformers ☆43 Feb 12, 2025 Updated last year
- Basic plotting of tabular data for the command line. ☆13 Apr 14, 2022 Updated 3 years ago
- An application to visualize the semantic distance between two words using the Wordnet lexical database and algorithmic path finding. ☆11 Jan 18, 2021 Updated 5 years ago
- PERM GaussianKG ☆10 Nov 24, 2021 Updated 4 years ago
- Bayesian scaling laws for in-context learning. ☆15 Mar 12, 2025 Updated 11 months ago
- ☆13 Dec 11, 2020 Updated 5 years ago
- Automated Testing and Package Uploading ☆12 Oct 3, 2018 Updated 7 years ago
- Creating user interfaces for data science with Jupyter widgets ☆11 Oct 28, 2017 Updated 8 years ago