mishajw/repeng

Experiments with representation engineering

☆14

Alternatives and similar repositories for repeng

Users who are interested in repeng are comparing it to the libraries listed below.

kxcloud / gradient-routing
☆11 · Dec 4, 2024 · Updated last year
koayon / atp_star
PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind)
☆20 · Jan 19, 2025 · Updated last year
LRudL / evalugator
(Model-written) LLM evals library
☆19 · Dec 13, 2024 · Updated last year
Jiaxin-Wen / MisleadLM
Official Code for our paper: "Language Models Learn to Mislead Humans via RLHF"
☆20 · Oct 11, 2024 · Updated last year
tripos-education / maths-tripos-questions
Archive of questions from the Cambridge Mathematics Tripos
☆10 · Jun 6, 2022 · Updated 4 years ago
Cadenza-Labs / sleeper-agents
☆15 · Jul 12, 2024 · Updated 2 years ago
slavachalnev / SAE-TS
Improving Steering Vectors by Targeting Sparse Autoencoder Features
☆29 · Nov 20, 2024 · Updated last year
ApolloResearch / sample
Repository with sample code using Apollo's suggested engineering practices
☆15 · Dec 16, 2024 · Updated last year
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆28 · Sep 25, 2024 · Updated last year
dtch1997 / steering-bench
Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"
☆22 · Dec 14, 2024 · Updated last year
understanding-search / maze-transformer
This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks.
☆35 · Oct 28, 2025 · Updated 8 months ago
annahdo / implementing_activation_steering
A collection of different ways to implement accessing and modifying internal model activations for LLMs
☆24 · Oct 18, 2024 · Updated last year
FatemehShiri / Spatial-MM
☆12 · Jan 10, 2025 · Updated last year
ArthurConmy / Automatic-Circuit-Discovery
☆293 · Oct 1, 2024 · Updated last year
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆99 · Dec 31, 2025 · Updated 6 months ago
GokuMohandas / follow
☆11 · Oct 15, 2021 · Updated 4 years ago
IBM / sae-steering
Code to enable layer-level steering in LLMs using sparse auto encoders
☆34 · Sep 18, 2025 · Updated 10 months ago
callummcdougall / path_patching
Implementation of path patching & activation patching (will eventually add to TransformerLens).
☆15 · Jan 8, 2024 · Updated 2 years ago
exoji2e / hashcode-template
☆16 · Feb 24, 2022 · Updated 4 years ago
ArthurConmy / MishformerLens
MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing…
☆10 · Oct 7, 2024 · Updated last year
neelnanda-io / 1L-Sparse-Autoencoder
☆141 · Oct 28, 2023 · Updated 2 years ago
HumanCompatibleAI / leela-interp
Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
☆31 · Jun 4, 2024 · Updated 2 years ago
vedantpalit / Towards-Vision-Language-Mechanistic-Interpretability
This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce…
☆25 · Feb 16, 2026 · Updated 5 months ago
tegg89 / Deep-blogs
A curated lists of self-taught materials including research blogs
☆16 · Dec 12, 2016 · Updated 9 years ago
benthecoder / AI
learning AI from scratch
☆14 · Feb 17, 2024 · Updated 2 years ago
simplelifetime / TIVE
Less is More: High-value Data Selection for Visual Instruction Tuning
☆20 · Jan 18, 2025 · Updated last year
CLUEbenchmark / Math24o
Math24o: 高中奥林匹克数学竞赛测评集 High School Olympiad Mathematics Chinese Benchmark
☆14 · Mar 27, 2025 · Updated last year
tamangmilan / llama3
Building Llama 3 from scratch using PyTorch
☆13 · Sep 1, 2024 · Updated last year
alignedai / HappyFaces
The Happy Faces Benchmark
☆15 · Jul 20, 2023 · Updated 3 years ago
gabrielpreda / generative_ai
Kaggle Notebooks, Utility Scripts using Generative AI tools to check new models, fine tune models, test with various prompts, create Retr…
☆18 · Mar 8, 2026 · Updated 4 months ago
ssbuild / aigc_evals
aigc evals
☆10 · Dec 2, 2023 · Updated 2 years ago
smoreira / MultiLayerPerceptron
MLP with Tensorflow and IRIS Dataset
☆10 · Feb 15, 2024 · Updated 2 years ago
andrewschreiber / agent
Interpretability dashboard for reinforcement learners
☆16 · Jun 4, 2019 · Updated 7 years ago
nunorc / squad-v1.1-pt
Portuguese translation of the SQuAD dataset
☆19 · Oct 22, 2020 · Updated 5 years ago
Breakend / SelfDestructingModels
☆14 · Aug 9, 2023 · Updated 2 years ago
kivancgunduz / expiration-date-detection
An API that detect expiration date from the product package's picture based on Deep Learning Algorithms
☆11 · Jun 4, 2022 · Updated 4 years ago
lpjiang97 / dynamic-predictive-coding
Code accompanying "Dynamic Predictive Coding: A Model of Hierarchical Sequence Learning and Prediction in the Neocortex"
☆10 · Mar 2, 2025 · Updated last year
dongjunKANG / VIM
☆11 · Oct 16, 2023 · Updated 2 years ago
claudia-viaro / Wdss-UCLdss_research
☆12 · Aug 31, 2022 · Updated 3 years ago