zepingyu0512/neuron-attribution

code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models

☆52

Alternatives and similar repositories for neuron-attribution

Users who are interested in neuron-attribution are comparing it to the libraries listed below.

zepingyu0512 / in-context-mechanism
code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M…
☆13 · Nov 17, 2024 · Updated last year
zepingyu0512 / arithmetic-mechanism
code for EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
☆12 · Nov 17, 2024 · Updated last year
LLM-MI-Research / Actionable-MI
☆15 · Jan 20, 2026 · Updated 6 months ago
technion-cs-nlp / hallucination-mitigation
☆23 · Dec 17, 2024 · Updated last year
zepingyu0512 / awesome-llm-understanding-mechanism
awesome papers in LLM interpretability
☆623 · Aug 20, 2025 · Updated 11 months ago
allenai / hyperdecoders
Codebase for Hyperdecoders https://arxiv.org/abs/2203.08304
☆14 · Oct 11, 2022 · Updated 3 years ago
THU-KEG / Event-Level-Knowledge-Editing
☆12 · Apr 25, 2024 · Updated 2 years ago
cooperleong00 / Awesome-LLM-Interpretability
A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc..
☆308 · Jan 22, 2026 · Updated 5 months ago
redwoodresearch / Easy-Transformer
☆148 · Aug 4, 2024 · Updated last year
ekinakyurek / influence
Code for "Tracing Knowledge in Language Models Back to the Training Data"
☆40 · Dec 27, 2022 · Updated 3 years ago
paul-rottger / msts-multimodal-safety
Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models"
☆20 · Mar 31, 2025 · Updated last year
MANGA-UOFA / PTfer
☆11 · Nov 13, 2024 · Updated last year
ruizheliUOA / ARC_JSD
A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation
☆15 · Aug 28, 2025 · Updated 10 months ago
RUCAIBox / Language-Specific-Neurons
☆91 · Dec 23, 2024 · Updated last year
Trae1ounG / Pretrain_Space_RLVR
[arxiv: 2604.14142] From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space
☆17 · Apr 16, 2026 · Updated 3 months ago
Trustworthy-Information-Access / LLM-Knowledge-Boundary-Perception-via-Internal-States
Official code for the paper Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception. The code is based on t…
☆22 · Aug 5, 2025 · Updated 11 months ago
zepingyu0512 / awesome-LLM-neuron
☆36 · Jun 13, 2025 · Updated last year
yjywdzh / ACE
This repository refers to the codes of paper ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall
☆15 · Jan 31, 2026 · Updated 5 months ago
HoagyC / sparse_coding
Using sparse coding to find distributed representations used by neural networks.
☆306 · Nov 10, 2023 · Updated 2 years ago
saprmarks / feature-circuits
☆223 · Oct 14, 2025 · Updated 9 months ago
UmeanNever / NovelSum
[ACL 2025 Main] Official Repo for Paper "Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric"
☆42 · Feb 10, 2026 · Updated 5 months ago
gangiswag / infogent
☆24 · Mar 1, 2025 · Updated last year
ablghtianyi / ICL_Modular_Arithmetic
☆19 · Mar 25, 2025 · Updated last year
machinelearning4health / TextHoaxer
Implementation Code of TextHoaxer
☆15 · Aug 21, 2022 · Updated 3 years ago
MrBlankness / TPO
Pytorch implementation of Tree Preference Optimization (TPO) (Accepted by ICLR'25)
☆28 · Apr 24, 2025 · Updated last year
shuyhere / Awesome-Sparse-Autoencoder
Collection of Reverse Engineering in Large Model
☆35 · Jan 8, 2025 · Updated last year
NYUSHCS / UniGLM
☆15 · Jul 5, 2024 · Updated 2 years ago
tdooms / bilinear-decomposition
Official repo for the paper "Bilinear MLPs enable weight-based mechanistic interpretability".
☆42 · Jun 2, 2026 · Updated last month
zjunlp / Kformer
[NLPCC 2022] Kformer: Knowledge Injection in Transformer Feed-Forward Layers
☆39 · Oct 20, 2022 · Updated 3 years ago
NEUIR / P-ALIGN
[ACL '26] source code for the paper: "Long-Chain Reasoning Distillation via Adaptive Prefix Alignment"
☆16 · Jan 21, 2026 · Updated 6 months ago
acl-org / arr-health
Monitoring the health of ARR
☆32 · Apr 4, 2026 · Updated 3 months ago
Glaciohound / LM-Steer
Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award)
☆149 · Jul 13, 2025 · Updated last year
interpretingdl / eacl2024_transformer_interpretability_tutorial
Materials for EACL2024 tutorial: Transformer-specific Interpretability
☆66 · Mar 26, 2024 · Updated 2 years ago
AngelaZZZ-611 / reasoning_models_probing
☆21 · May 14, 2026 · Updated 2 months ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆58 · May 1, 2025 · Updated last year
zjunlp / KnowledgeCircuits
[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers
☆172 · Nov 14, 2025 · Updated 8 months ago
baponkar / Third-Person-Shooter
Third Person Shooter for Unity
☆13 · Jun 26, 2022 · Updated 4 years ago
heng840 / AMIG
Code of Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Ne…
☆28 · Mar 19, 2024 · Updated 2 years ago
yihuaihong / ConceptVectors
[EMNLP 2025 Main] ConceptVectors Benchmark and Code for the paper "Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces"
☆40 · Aug 20, 2025 · Updated 11 months ago