princeton-nlp/Edge-Pruning

[NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".

☆70

Alternatives and similar repositories for Edge-Pruning

Users who are interested in Edge-Pruning are comparing it to the repositories listed below.

hannamw / EAP-IG
☆84 · May 23, 2026 · Updated 2 months ago
Aaquib111 / edge-attribution-patching
Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery"
☆48 · May 31, 2024 · Updated 2 years ago
efarrell1 / train_sparse_autoencoder
Trains Sparse Autoencoders based on outputs from language models
☆11 · Oct 7, 2024 · Updated last year
jiahai-feng / binding-iclr
☆19 · Mar 5, 2024 · Updated 2 years ago
shawnricecake / search-llm
[NeurIPS 2024] Search for Efficient LLMs
☆16 · Jan 16, 2025 · Updated last year
MikaStars39 / FeatureAlignment
FeatureAlignment = Alignment + Mechanistic Interpretability
☆35 · Mar 8, 2025 · Updated last year
FlyingPumba / InterpBench
A benchmark for mechanistic discovery of circuits in Transformers
☆17 · Dec 15, 2024 · Updated last year
hannamw / gpt2-greater-than
Code Release for the 2023 NeurIPS Paper How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained langua…
☆17 · Dec 6, 2024 · Updated last year
JasonGross / guarantees-based-mechanistic-interpretability
☆18 · Jul 21, 2026 · Updated last week
swei2001 / RouteSAEs
☆15 · Jan 2, 2026 · Updated 6 months ago
Butanium / tiny-activation-dashboard
A tiny easily hackable implementation of a feature dashboard.
☆17 · Oct 21, 2025 · Updated 9 months ago
adamkarvonen / dictionary_learning_demo
☆26 · Aug 23, 2025 · Updated 11 months ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
fjzzq2002 / pizza
Code repository for "The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks"
☆20 · Nov 24, 2023 · Updated 2 years ago
OSU-NLP-Group / reversal-curse-binding
☆25 · Apr 3, 2025 · Updated last year
real-absolute-AI / Unnatural_Language
The official repository of 'Unnatural Language Are Not Bugs but Features for LLMs'
☆24 · May 20, 2025 · Updated last year
tim-lawson / mlsae
Multi-Layer Sparse Autoencoders (ICLR 2025)
☆30 · Feb 6, 2026 · Updated 5 months ago
msakarvadia / memorization
Localizing Memorized Sequences in Language Models
☆22 · Oct 15, 2025 · Updated 9 months ago
EleutherAI / elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e…
☆33 · May 23, 2024 · Updated 2 years ago
Helsinki-NLP / OPUS-MT-testsets
Benchmarks for evaluating MT models
☆11 · Jun 26, 2024 · Updated 2 years ago
locuslab / massive-activations
Code accompanying the paper "Massive Activations in Large Language Models"
☆202 · Mar 4, 2024 · Updated 2 years ago
bilal-chughtai / rep-theory-mech-interp
☆31 · May 4, 2023 · Updated 3 years ago
neelnanda-io / Grokking
A Mechanistic Interpretability Analysis of Grokking
☆29 · Sep 26, 2022 · Updated 3 years ago
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆1,005 · Updated this week
Princeton-SysML / kNNLM_privacy
Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888
☆37 · Jun 10, 2024 · Updated 2 years ago
zwhe99 / LLM-MT-Eval
{DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is}
☆14 · Jun 18, 2023 · Updated 3 years ago
teobaluta / etio
Causal Reasoning for Membership Inference Attacks
☆11 · Oct 21, 2022 · Updated 3 years ago
matchten / LoRA-Models-for-SAEs
Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"
☆17 · Mar 31, 2025 · Updated last year
Abonia1 / yolosegment2labelme
yolosegment2labelme - a Python package that allows you to convert YOLO segmentation prediction results to LabelMe and anylabeling JSON fo…
☆10 · May 8, 2024 · Updated 2 years ago
s-ball-10 / jailbreak_dynamics
☆25 · Jun 13, 2024 · Updated 2 years ago
Phylliida / MambaLens
Mamba support for transformer lens
☆20 · Sep 17, 2024 · Updated last year
dsp444 / save_canvas_discussion
Tool to convert JSON formatted discussion posts on Canvas LMS into HTML files - similar to saving student text-entry assignments
☆13 · May 20, 2022 · Updated 4 years ago
xie-lab-ml / Meissonic-Inference
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer
☆16 · Nov 21, 2024 · Updated last year
pprp / Pruner-Zero
[ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
☆100 · Nov 25, 2024 · Updated last year
Dakingrai / awesome-mechanistic-interpretability-lm-papers
☆260 · Nov 22, 2024 · Updated last year
HumanCompatibleAI / leela-interp
Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
☆31 · Jun 4, 2024 · Updated 2 years ago
varunnair18 / FISH
Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021).
☆59 · Jan 14, 2022 · Updated 4 years ago
OpenMOSS / Llamascopium
Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants.
☆225 · Jul 22, 2026 · Updated last week
harbor-framework / harbor-index
A compact high-signal benchmark for evaluating frontier agents
☆21 · Updated this week