KihoPark/linear_rep_geometry

KihoPark / linear_rep_geometry

Code for 'The Linear Representation Hypothesis and the Geometry of Large Language Models' (ICML 2024)

☆125

Alternatives and similar repositories for linear_rep_geometry

Users who are interested in linear_rep_geometry are comparing it to the repositories listed below.

KihoPark / LLM_Categorical_Hierarchical_Representations
Code for 'The Geometry of Categorical and Hierarchical Concepts in Large Language Models' (ICLR 2025, Oral)
☆115 · Feb 11, 2025 · Updated last year
zihao12 / concept-algebra
☆27 · Feb 9, 2023 · Updated 3 years ago
DanielSc4 / Dynamic-Activation-Composition
Materials for "Multi-property Steering of Large Language Models with Dynamic Activation Composition"
☆14 · Nov 22, 2024 · Updated last year
explanare / ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
☆58 · Oct 30, 2025 · Updated 8 months ago
saprmarks / dictionary_learning
☆427 · Aug 21, 2025 · Updated 11 months ago
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆90 · Nov 27, 2024 · Updated last year
KihoPark / dual-steering
Code for 'The Information Geometry of Softmax: Probing and Steering' (ICML 2026)
☆17 · May 19, 2026 · Updated 2 months ago
saprmarks / geometry-of-truth
☆113 · Aug 8, 2024 · Updated last year
steering-vectors / steering-vectors
Steering vectors for transformer language models in Pytorch / Huggingface
☆157 · Feb 21, 2025 · Updated last year
jacobdunefsky / transcoder_circuits
☆212 · Nov 17, 2024 · Updated last year
safety-research / SHADE-Arena
☆26 · Jun 22, 2025 · Updated last year
DLR-SC / style-vectors-for-steering-llms
Code release for the paper "Style Vectors for Steering Generative Large Language Models", accepted to the Findings of the EACL 2024.
☆37 · Sep 26, 2024 · Updated last year
jkutaso / SHADE-Arena
☆57 · May 9, 2025 · Updated last year
ericwtodd / function_vectors
Function Vectors in Large Language Models (ICLR 2024)
☆199 · Apr 30, 2026 · Updated 2 months ago
cvenhoff / steering-thinking-llms
☆38 · Jul 9, 2025 · Updated last year
stanfordnlp / pyvene
Stanford NLP Python library for understanding and improving PyTorch models via interventions
☆892 · Mar 6, 2026 · Updated 4 months ago
cfpark00 / concept-learning
Concept Learning Dynamics
☆17 · Oct 29, 2024 · Updated last year
cadentj / caft
☆25 · Mar 30, 2026 · Updated 3 months ago
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆995 · Updated this week
ArthurConmy / Automatic-Circuit-Discovery
☆293 · Oct 1, 2024 · Updated last year
chrisliu298 / awesome-representation-engineering
A resource repository for representation engineering in large language models
☆156 · Nov 14, 2024 · Updated last year
GraySwanAI / circuit-breakers
Improving Alignment and Robustness with Circuit Breakers
☆266 · Sep 24, 2024 · Updated last year
ruizheliUOA / ARC_JSD
A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation
☆15 · Aug 28, 2025 · Updated 10 months ago
openai / sparse_autoencoder
☆596 · Jul 19, 2024 · Updated 2 years ago
andyzoujm / representation-engineering
Representation Engineering: A Top-Down Approach to AI Transparency
☆1,013 · Aug 14, 2024 · Updated last year
TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆358 · Apr 30, 2026 · Updated 2 months ago
rgreenblatt / model_organism_public
☆15 · Jun 17, 2025 · Updated last year
decoderesearch / SAELens
Training Sparse Autoencoders on Language Models
☆1,483 · Updated this week
p-lambda / incontext-learning
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici…
☆108 · Nov 10, 2023 · Updated 2 years ago
saprmarks / feature-circuits
☆223 · Oct 14, 2025 · Updated 9 months ago
abhishekpanigrahi1996 / transformer_in_transformer
☆47 · Oct 11, 2023 · Updated 2 years ago
neelnanda-io / Crosscoders
☆60 · Nov 19, 2024 · Updated last year
nrimsky / CAA
Steering Llama 2 with Contrastive Activation Addition
☆240 · May 23, 2024 · Updated 2 years ago
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆733 · Updated this week
andyrdt / refusal_direction
Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".
☆423 · Jun 13, 2025 · Updated last year
collin-burns / discovering_latent_knowledge
☆287 · Mar 2, 2024 · Updated 2 years ago
javiferran / sae_entities
☆78 · Mar 6, 2025 · Updated last year
JoshEngels / SAE-Dark-Matter
Code for our paper "Decomposing The Dark Matter of Sparse Autoencoders"
☆23 · Feb 6, 2025 · Updated last year
montemac / activation_additions
Algebraic value editing in pretrained language models
☆71 · Nov 1, 2023 · Updated 2 years ago