IBM / AISteer360
The AI Steerability 360 toolkit is an extensible library for general-purpose steering of LLMs.
☆76 · Updated 3 weeks ago
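The toolkit's own API is not shown on this page; as a rough illustration of what "steering" an LLM means in this context, the sketch below builds a contrast-derived steering vector and adds it to one transformer block's activations via a plain PyTorch forward hook. The model name (gpt2), layer index, and scale are illustrative assumptions, not anything taken from AISteer360.

```python
# Minimal activation-steering sketch (NOT the AISteer360 API).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # assumption: any small causal LM works for this illustration
LAYER = 6            # assumption: mid-network block, chosen arbitrarily
SCALE = 4.0          # assumption: steering strength, chosen for illustration

tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def mean_hidden(prompt: str) -> torch.Tensor:
    """Mean hidden state of the prompt at block LAYER, shape (1, hidden_size)."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER].mean(dim=1)

# Steering vector: activation difference between two contrasting prompts.
steer = mean_hidden("I love this movie.") - mean_hidden("I hate this movie.")

def add_steer(module, inputs, output):
    # GPT-2 blocks return a tuple; element 0 holds the hidden states.
    return (output[0] + SCALE * steer,) + output[1:]

# Register the hook, generate with the steered block, then restore the model.
handle = model.transformer.h[LAYER].register_forward_hook(add_steer)
prompt = tok("The film was", return_tensors="pt")
gen = model.generate(**prompt, max_new_tokens=20, do_sample=False)
print(tok.decode(gen[0], skip_special_tokens=True))
handle.remove()
```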
Alternatives and similar repositories for AISteer360
Users interested in AISteer360 are comparing it to the libraries listed below.
- In-Context Explainability 360 toolkit ☆65 · Updated 3 weeks ago
- MetaQuantus is an XAI performance tool to identify reliable evaluation metrics ☆40 · Updated last year
- Layer-wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024] ☆219 · Updated 7 months ago
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆94 · Updated 2 months ago
- A toolkit for quantitative evaluation of data attribution methods. ☆55 · Updated 6 months ago
- OpenXAI: Towards a Transparent Evaluation of Model Explanations ☆252 · Updated last year
- Fairness toolkit for PyTorch, scikit-learn, and AutoGluon ☆33 · Updated 2 months ago
- AssetOpsBench - Industry 4.0 ☆956 · Updated this week
- Unified access to Large Language Model modules using NNsight ☆88 · Updated this week
- 🪄 Interpreto is an interpretability toolbox for LLMs ☆141 · Updated this week
- ☆34 · Updated last year
- Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations ☆639 · Updated 3 weeks ago
- [NeurIPS 2024] CoSy is an automatic evaluation framework for textual explanations of neurons. ☆19 · Updated 2 weeks ago
- pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation ☆142 · Updated 3 weeks ago
- Sparse Autoencoder for Mechanistic Interpretability ☆291 · Updated last year
- Synthetic Data Generation for Foundation Models ☆21 · Updated 3 months ago
- ☆146 · Updated last month
- ☆83 · Updated 11 months ago
- This repository collects all relevant resources about interpretability in LLMs ☆391 · Updated last year
- SETOL: SemiEmpirical Theory of (Deep) Learning ☆29 · Updated 6 months ago
- XAI-Bench is a library for benchmarking feature attribution explainability techniques ☆71 · Updated 3 years ago
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆800 · Updated this week
- ☆432 · Updated last week
- Using sparse coding to find distributed representations used by neural networks. ☆293 · Updated 2 years ago
- [NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La… ☆25 · Updated 3 months ago
- Python package to compute interaction indices that extend the Shapley Value. AISTATS 2023. ☆19 · Updated 2 years ago
- Designed for interpretability researchers who want to do research on or with interp agents; provides quality-of-life improvements and fix… ☆121 · Updated 3 weeks ago
- Conformal Language Modeling ☆31 · Updated 2 years ago
- ☆59 · Updated 2 months ago
- An eXplainable AI toolkit with Concept Relevance Propagation and Relevance Maximization ☆140 · Updated 3 weeks ago