FOR-sight-ai / interpreto
Interpreto is an interpretability toolbox for LLMs
★61, updated last week
Alternatives and similar repositories for interpreto
Users who are interested in interpreto are comparing it to the libraries listed below.
- Overcomplete is a Vision-based SAE Toolbox (★101, updated 2 weeks ago)
- CODS - Conformal Object Detection and Segmentation (★19, updated 3 weeks ago)
- Influenciae is a TensorFlow Toolbox for Influence Functions (★64, updated last year)
- Build and train Lipschitz-constrained networks: TensorFlow implementation of k-Lipschitz layers (★100, updated 8 months ago)
- (★38, updated 2 months ago)
- Simple, compact, and hackable post-hoc deep OOD detection for already trained TensorFlow or PyTorch image classifiers. (★60, updated last month)
- (★53, updated last year)
- Build and train Lipschitz-constrained networks: PyTorch implementation of 1-Lipschitz layers. For TensorFlow/Keras implementation, see ht… (★35, updated 3 weeks ago)
- Reliable, minimal and scalable library for pretraining foundation and world models (★90, updated this week)
- New implementations of old orthogonal layers unlock large scale training. (★23, updated 2 months ago)
- Repository for PURE: Turning Polysemantic Neurons Into Pure Features by Identifying Relevant Circuits, accepted at CVPR 2024 XAI4CV Works… (★19, updated last year)
- Latent Program Network (from the "Searching Latent Program Spaces" paper) (★105, updated last month)
- Tools for optimizing steering vectors in LLMs. (★14, updated 7 months ago)
- [CVPRW 2024] Conformal prediction for uncertainty quantification in image segmentation (★26, updated 11 months ago)
- Code for: "CRAFT: Concept Recursive Activation FacTorization for Explainability" (CVPR 2023) (★70, updated 2 years ago)
- Sparse and discrete interpretability tool for neural networks (★64, updated last year)
- Cost aware hyperparameter tuning algorithm (★173, updated last year)
- Interpreting how transformers simulate agents performing RL tasks (★88, updated 2 years ago)
- A TinyStories LM with SAEs and transcoders (★13, updated 7 months ago)
- A projection-based framework for gradient-free and parallel learning (★106, updated 5 months ago)
- Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (M… (★29, updated last year)
- LENS Project (★51, updated last year)
- (★24, updated 11 months ago)
- DiffuLab is designed to provide a simple and flexible way to train diffusion models while allowing full customization of its core compone… (★40, updated last week)
- (★119, updated 5 months ago)
- Deep Networks Grok All the Time and Here is Why (★37, updated last year)
- Sparse Autoencoder Training Library (★55, updated 6 months ago)
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok … (★27, updated 2 weeks ago)
- Comparison between GFlowNets & Maximum Entropy RL (★19, updated last year)
- Attribution-based Parameter Decomposition (★31, updated 5 months ago)