cadentj/caft

☆25

Alternatives and similar repositories for caft

Users that are interested in caft are comparing it to the libraries listed below.

tim-hua-01 / steering-eval-awareness-public
☆17 · Mar 16, 2026 · Updated 4 months ago
FarnoushRJ / RelP
[NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La…
☆29 · Nov 3, 2025 · Updated 8 months ago
safety-research / false-facts
☆50 · Jul 4, 2025 · Updated last year
science-of-finetuning / diffing-toolkit
A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively.
☆78 · Updated this week
aypan17 / latentqa
☆34 · Nov 16, 2025 · Updated 8 months ago
ajobi-uhc / seer
This was designed for interp researchers who want to do research on or with interp agents to give quality of life improvements and fix …
☆146 · Feb 8, 2026 · Updated 5 months ago
HLTCHKUST / UniVaR
Official repository for paper "High-Dimension Human Value Representation in Large Language Models" (NAACL'25 Main)
☆23 · Jul 9, 2024 · Updated 2 years ago
IINemo / llm-uncertainty-head
☆26 · Feb 23, 2026 · Updated 5 months ago
aisa-group / decomposing-eval-awareness
Decomposing and measuring evaluation awareness in existing benchmarks and our proposed EvalAwareBench.
☆19 · Jun 1, 2026 · Updated last month
aaronmueller / MIB
Landing page for MIB: A Mechanistic Interpretability Benchmark
☆26 · Aug 15, 2025 · Updated 11 months ago
ckkissane / sae-transfer
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13 · Jul 18, 2024 · Updated 2 years ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆99 · Dec 31, 2025 · Updated 6 months ago
TransluceAI / circuits
ADAG: Transluce's MLP neuron-level circuit tracing library
☆34 · Apr 10, 2026 · Updated 3 months ago
jettjaniak / chainscope
Repository for the "Chain-of-Thought Reasoning In The Wild Is Not Always Faithful" paper
☆35 · Mar 31, 2026 · Updated 3 months ago
seraphlabs-ca / SentenceMIM-demo
This repo contains code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model"
☆28 · Jun 22, 2022 · Updated 4 years ago
safety-research / safety-tooling
Inference API for many LLMs and other useful tools for empirical research
☆134 · May 29, 2026 · Updated last month
TemporaryLoRA / FreeLM
☆15 · Feb 10, 2026 · Updated 5 months ago
McGill-NLP / latent-translation
Code for the paper "Modelling Latent Translations for Cross-Lingual Transfer"
☆17 · Nov 22, 2021 · Updated 4 years ago
KihoPark / linear_rep_geometry
Code for 'The Linear Representation Hypothesis and the Geometry of Large Language Models' (ICML 2024)
☆125 · Feb 11, 2025 · Updated last year
goodfire-ai / scribe-task-suite
A suite of interpretability tasks to evaluate agents using Scribe for notebook access
☆18 · Oct 2, 2025 · Updated 9 months ago
nickkeesG / Pantheon
Experimental LLM interface exploring new ways to use AI to improve human thinking
☆21 · Apr 13, 2026 · Updated 3 months ago
yc015 / TalkTuner-chatbot-llm-dashboard
Designing a Dashboard for Transparency and Control of Conversational AI, https://arxiv.org/abs/2406.07882
☆39 · Oct 7, 2025 · Updated 9 months ago
lkopf / cosy
[NeurIPS 2024] CoSy is an automatic evaluation framework for textual explanations of neurons.
☆20 · Jan 28, 2026 · Updated 5 months ago
mitvis / saliency-cards
Saliency Cards are transparency documentation for saliency methods. Learn about new saliency methods or document your own!
☆19 · Jun 9, 2023 · Updated 3 years ago
goodfire-ai / scribe
☆86 · Feb 18, 2026 · Updated 5 months ago
JasonGross / guarantees-based-mechanistic-interpretability
☆18 · Updated this week
andyrdt / refusal_direction
Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".
☆424 · Jun 13, 2025 · Updated last year
wbopan / safety-residual-space
Multi-dimensional analysis of orthogonal safety directions in LLM alignment
☆23 · Jun 12, 2026 · Updated last month
TeunvdWeij / sandbagging
☆20 · Nov 15, 2024 · Updated last year
amack315 / unsupervised-steering-vectors
☆38 · Apr 30, 2024 · Updated 2 years ago
EleutherAI / bergson
Mapping out the "memory" of neural nets with data attribution
☆70 · Updated this week
microsoft / implicitMemory
☆19 · Feb 12, 2026 · Updated 5 months ago
VITA-Group / MaT-FL
[CVPRW 2023] "Many-Task Federated Learning: A New Problem Setting and A Simple Baseline" by Ruisi Cai, Xiaohan Chen, Shiwei Liu, Jayanth …
☆13 · Aug 28, 2023 · Updated 2 years ago
EleutherAI / tokengrams
Efficiently computing & storing token n-grams from large corpora
☆28 · Jun 15, 2026 · Updated last month
wangrongding / folder-print
🌿 Quickly generate a folder's directory structure; supports setting the directory depth and exporting to a markdown file.
☆13 · Oct 19, 2022 · Updated 3 years ago
koayon / atp_star
PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind)
☆20 · Jan 19, 2025 · Updated last year
interp-reasoning / thought-anchors
⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.
☆137 · Oct 27, 2025 · Updated 8 months ago
cvenhoff / steering-thinking-llms
☆39 · Jul 9, 2025 · Updated last year
ApolloResearch / deception-detection
☆44 · Feb 11, 2025 · Updated last year