Ybakman / TruthTorchLM
☆59 Updated 2 months ago
Alternatives and similar repositories for TruthTorchLM
Users who are interested in TruthTorchLM compare it to the libraries listed below.
- This repository collects all relevant resources about interpretability in LLMs ☆391 Updated last year
- ☆143 Updated last month
- ☆230 Updated last year
- Conformal Language Modeling ☆31 Updated 2 years ago
- AI Logging for Interpretability and Explainability🔬 ☆140 Updated last year
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆79 Updated last year
- [NeurIPS D&B '25] The one-stop repository for LLM unlearning ☆479 Updated last month
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆163 Updated 7 months ago
- Using sparse coding to find distributed representations used by neural networks. ☆293 Updated 2 years ago
- [ICLR 2025] General-purpose activation steering library ☆141 Updated 4 months ago
- ☆197 Updated last year
- Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments). ☆404 Updated last year
- A fast, effective data attribution method for neural networks in PyTorch ☆229 Updated last year
- ☆432 Updated last week
- Steering Llama 2 with Contrastive Activation Addition ☆207 Updated last year
- Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization ☆41 Updated last year
- ☆58 Updated 2 years ago
- Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature ☆178 Updated 7 months ago
- Persona Vectors: Monitoring and Controlling Character Traits in Language Models ☆348 Updated 6 months ago
- Python package for measuring memorization in LLMs. ☆184 Updated 6 months ago
- Sparse Autoencoder for Mechanistic Interpretability ☆290 Updated last year
- A resource repository for representation engineering in large language models ☆148 Updated last year
- ☆103 Updated last year
- ☆34 Updated last year
- ☆184 Updated last year
- ☆42 Updated 2 years ago
- Tools for optimizing steering vectors in LLMs. ☆19 Updated 10 months ago
- ☆241 Updated last year
- Unified access to Large Language Model modules using NNsight ☆88 Updated this week
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆198 Updated 11 months ago