IBM / AISteer360
The AI Steerability 360 toolkit is an extensible library for general purpose steering of LLMs.
☆76 Updated 3 weeks ago
Alternatives and similar repositories for AISteer360
Users who are interested in AISteer360 are comparing it to the libraries listed below.
- In-Context Explainability 360 toolkit ☆65 Updated 3 weeks ago
- OpenXAI : Towards a Transparent Evaluation of Model Explanations ☆252 Updated last year
- MetaQuantus is an XAI performance tool to identify reliable evaluation metrics ☆40 Updated last year
- Layer-wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024] ☆219 Updated 7 months ago
- A toolkit for quantitative evaluation of data attribution methods. ☆55 Updated 6 months ago
- Fairness toolkit for pytorch, scikit learn and autogluon ☆33 Updated 2 months ago
- 🪄 Interpreto is an interpretability toolbox for LLMs ☆141 Updated this week
- This repository collects all relevant resources about interpretability in LLMs ☆391 Updated last year
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆94 Updated 2 months ago
- [NeurIPS 2024] CoSy is an automatic evaluation framework for textual explanations of neurons. ☆19 Updated 2 weeks ago
- Sparse Autoencoder for Mechanistic Interpretability ☆290 Updated last year
- TalkToModel gives anyone with the powers of XAI through natural language conversations 💬! ☆126 Updated 2 years ago
- pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation ☆142 Updated 3 weeks ago
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆63 Updated last year
- ☆34 Updated last year
- AssetOpsBench - Industry 4.0 ☆956 Updated this week
- Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations ☆639 Updated 3 weeks ago
- Unified access to Large Language Model modules using NNsight ☆88 Updated this week
- ☆389 Updated 5 months ago
- Mechanistic understanding and validation of large AI models with SemanticLens ☆50 Updated 2 months ago
- ☆143 Updated last month
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆800 Updated this week
- ☆83 Updated 11 months ago
- Synthetic Data Generation for Foundation Models ☆21 Updated 3 months ago
- ControlArena is a collection of settings, model organisms and protocols - for running control experiments. ☆153 Updated this week
- SETOL: SemiEmpirical Theory of (Deep) Learning ☆29 Updated 6 months ago
- This was designed for interp researchers who want to do research on or with interp agents to give quality of life improvements and fix … ☆121 Updated 3 weeks ago
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆857 Updated last week
- A framework for interpreting modern AI systems using Monte Carlo Shapley value estimation. Model-agnostic explainability across language … ☆72 Updated 3 weeks ago
- Using sparse coding to find distributed representations used by neural networks. ☆293 Updated 2 years ago