Interpreto is an interpretability toolbox for LLMs
☆144 · Feb 25, 2026 · Updated this week
Alternatives and similar repositories for interpreto
Users that are interested in interpreto are comparing it to the libraries listed below
- Overcomplete is a Vision-based SAE Toolbox · ☆123 · Dec 4, 2025 · Updated 2 months ago
- Influenciae is a Tensorflow Toolbox for Influence Functions · ☆66 · Updated this week
- ☆14 · May 6, 2025 · Updated 9 months ago
- Generic Engine for Multi-disciplinary Scenarios, Exploration and Optimization. This is a MIRROR of our gitlab repository, the development… · ☆31 · Updated this week
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch · ☆10 · Aug 7, 2024 · Updated last year
- DL Backtrace is a new explainability technique for deep learning models that works for any modality and model type. · ☆23 · Feb 16, 2026 · Updated 2 weeks ago
- MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing… · ☆10 · Oct 7, 2024 · Updated last year
- A library for training crosscoders · ☆16 · May 28, 2025 · Updated 9 months ago
- ☆16 · Apr 7, 2025 · Updated 10 months ago
- Tools for optimizing steering vectors in LLMs. · ☆20 · Apr 10, 2025 · Updated 10 months ago
- A tiny easily hackable implementation of a feature dashboard. · ☆15 · Oct 21, 2025 · Updated 4 months ago
- ☆17 · Aug 30, 2025 · Updated 6 months ago
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. · ☆31 · Apr 22, 2025 · Updated 10 months ago
- ☆17 · Jul 9, 2025 · Updated 7 months ago
- DiffuLab is designed to provide a simple and flexible way to train diffusion models while allowing full customization of its core compone… · ☆43 · Jan 11, 2026 · Updated last month
- Attribution-based Parameter Decomposition · ☆34 · Jun 11, 2025 · Updated 8 months ago
- ☆15 · Jan 2, 2023 · Updated 3 years ago
- Build and train Lipschitz constrained networks: TensorFlow implementation of k-Lipschitz layers · ☆102 · Mar 14, 2025 · Updated 11 months ago
- CODS - Conformal Object Detection and Segmentation · ☆20 · Dec 15, 2025 · Updated 2 months ago
- An easy to use tool to apply adversarial attacks · ☆12 · Aug 9, 2024 · Updated last year
- ☆39 · Sep 15, 2025 · Updated 5 months ago
- Engine for collecting, uploading, and downloading model activations · ☆26 · Apr 2, 2025 · Updated 11 months ago
- A Numpy implementation of a Generative Adversarial Network. · ☆17 · Sep 4, 2020 · Updated 5 years ago
- Interpretability for Leela Chess Zero networks. · ☆19 · Nov 17, 2025 · Updated 3 months ago
- ☆17 · Updated this week
- New implementations of old orthogonal layers unlock large scale training. · ☆29 · Sep 19, 2025 · Updated 5 months ago
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok … · ☆30 · Dec 8, 2025 · Updated 2 months ago
- BM-MAE: Multimodal Masked Autoencoder Pre-training for 3D MRI-based Brain Tumor Analysis with Missing Modalities · ☆29 · Aug 24, 2025 · Updated 6 months ago
- experiments with d3 and r integration with shiny · ☆36 · Dec 7, 2012 · Updated 13 years ago
- Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation · ☆35 · Mar 3, 2025 · Updated 11 months ago
- LENS Project · ☆52 · Feb 22, 2024 · Updated 2 years ago
- Unified access to Large Language Model modules using NNsight · ☆93 · Updated this week
- Comprehensive Python Plotly tutorial & cheat sheet. Covers plotly.express, graph_objects & figure_factory for Data Science, 3D plotting, … · ☆22 · Dec 3, 2025 · Updated 2 months ago
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/ · ☆26 · Mar 10, 2025 · Updated 11 months ago
- Puncc is a python library for predictive uncertainty quantification using conformal prediction. · ☆372 · Updated this week
- Multi-Layer Sparse Autoencoders (ICLR 2025) · ☆29 · Feb 6, 2026 · Updated 3 weeks ago
- Layer-wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024] · ☆222 · Jul 11, 2025 · Updated 7 months ago
- ☆150 · Dec 30, 2025 · Updated 2 months ago
- ☆10 · Nov 12, 2024 · Updated last year