aypan17/latentqa

aypan17 / latentqa

☆34

Alternatives and similar repositories for latentqa

Users that are interested in latentqa are comparing it to the libraries listed below.

tiremoscode / dw-grupo58
☆20 · Nov 28, 2024 · Updated last year
buyi-Yang / getQzonehistory
☆12 · Nov 13, 2024 · Updated last year
abduvalimurodullayev1 / boilerplate_Drf
Boilerplate for a Django project, with many settings configurations.
☆10 · Nov 7, 2025 · Updated 8 months ago
noanabeshima / matryoshka-saes
☆33 · Nov 28, 2024 · Updated last year
zhqwqwq / Learning-Parity-with-CoT
[ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili…
☆12 · Mar 7, 2025 · Updated last year
slavachalnev / SAE-TS
Improving Steering Vectors by Targeting Sparse Autoencoder Features
☆29 · Nov 20, 2024 · Updated last year
Trustworthy-ML-Lab / CB-LLMs
[ICLR 25] A novel framework for building intrinsically interpretable LLMs with human-understandable concepts to ensure safety, reliabilit…
☆33 · Feb 5, 2026 · Updated 5 months ago
cadentj / caft
☆25 · Mar 30, 2026 · Updated 3 months ago
hgabor / nestjs-keret-2024
NestJS project template, configured with prisma and ejs
☆12 · Dec 1, 2024 · Updated last year
Model-GLUE / Model-GLUE
☆18 · Aug 19, 2024 · Updated last year
explanare / ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
☆58 · Oct 30, 2025 · Updated 8 months ago
LLM-MI-Research / Actionable-MI
☆15 · Jan 20, 2026 · Updated 6 months ago
tmlr-group / G-effect
[ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond"
☆16 · Feb 27, 2025 · Updated last year
samyadeepbasu / LocoGen
Localization of Knowledge in Text-to-Image Models
☆11 · Oct 8, 2024 · Updated last year
zepingyu0512 / neuron-attribution
code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models
☆52 · Nov 17, 2024 · Updated last year
tonychenxyz / selfie
This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,…
☆58 · Dec 9, 2024 · Updated last year
cywinski / eliciting-secret-knowledge
Code repository for "Eliciting Secret Knowledge from Language Models"
☆24 · Mar 30, 2026 · Updated 3 months ago
saprmarks / geometry-of-truth
☆114 · Aug 8, 2024 · Updated last year
sail-sg / Rigging-ChatbotArena
Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)
☆27 · Feb 25, 2025 · Updated last year
LYang-666 / TRGP
[ICLR 2022] Official Code Repository for "TRGP: TRUST REGION GRADIENT PROJECTION FOR CONTINUAL LEARNING"
☆22 · Oct 5, 2022 · Updated 3 years ago
cvenhoff / thinking-llms-interp
☆25 · Jul 8, 2026 · Updated 2 weeks ago
jorispos / ConceptorSteering
☆16 · Mar 13, 2025 · Updated last year
jiahai-feng / binding-iclr
☆19 · Mar 5, 2024 · Updated 2 years ago
adamkarvonen / SAEBench
☆178 · May 1, 2026 · Updated 2 months ago
francescortu / comp-mech
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals; ACL 2024
☆13 · May 24, 2024 · Updated 2 years ago
dtch1997 / steering-bench
Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"
☆22 · Dec 14, 2024 · Updated last year
AndreaGrandieri / ing-sw-2024-codex-naturalis
Project for the final exam of Software Engineering 2023-2024 at Politecnico di Milano
☆10 · Oct 19, 2024 · Updated last year
Dakingrai / awesome-mechanistic-interpretability-lm-papers
☆260 · Nov 22, 2024 · Updated last year
dragonjsq / -VPN
Free proxy, free VPN, a truly free VPN; shadowsocks, v2rey; official site: www.dragonvpn.cc
☆13 · Sep 4, 2024 · Updated last year
adamkarvonen / SAE_BoardGameEval
☆25 · Jan 28, 2025 · Updated last year
ZFancy / awesome-activation-engineering
A curated list of resources for activation engineering
☆140 · Oct 2, 2025 · Updated 9 months ago
wesg52 / universal-neurons
Universal Neurons in GPT2 Language Models
☆30 · May 28, 2024 · Updated 2 years ago
OPTML-Group / Unlearn-Smooth
[ICML25] Official repo for "Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond…
☆24 · Sep 27, 2025 · Updated 9 months ago
mchiquier / llm-mutate
☆15 · Oct 7, 2024 · Updated last year
stanfordnlp / axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
☆210 · Mar 12, 2026 · Updated 4 months ago
locuslab / acr-memorization
☆41 · Dec 19, 2024 · Updated last year
victorssilva / concreteness
Concreteness
☆20 · Nov 22, 2022 · Updated 3 years ago
zeyuyun1 / TransformerVis
☆43 · Nov 16, 2021 · Updated 4 years ago
wbopan / safety-residual-space
Multi-dimensional analysis of orthogonal safety directions in LLM alignment
☆23 · Jun 12, 2026 · Updated last month