multimodal-interpretability / maia
Official implementation of MAIA, A Multimodal Automated Interpretability Agent
☆80 Updated 2 months ago
Alternatives and similar repositories for maia
Users who are interested in maia are comparing it to the repositories listed below.
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆79 Updated last month
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆128 Updated 3 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆166 Updated 3 weeks ago
- Holistic evaluation of multimodal foundation models ☆47 Updated 9 months ago
- ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs). ☆238 Updated last week
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆76 Updated 8 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆22 Updated 2 weeks ago
- ☆41 Updated 9 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆172 Updated this week
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆74 Updated 5 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆161 Updated last year
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆35 Updated 6 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆66 Updated 11 months ago
- This repository is maintained to release dataset and models for multimodal puzzle reasoning. ☆83 Updated 2 months ago
- ☆93 Updated 3 months ago
- Official PyTorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" (ICLR '25) ☆69 Updated 2 months ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" ☆51 Updated 11 months ago
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆68 Updated 3 months ago
- Matryoshka Multimodal Models ☆106 Updated 3 months ago
- Official implementation of FIND (NeurIPS '23) Function Interpretation Benchmark and Automated Interpretability Agents ☆49 Updated 7 months ago
- ☆53 Updated 6 months ago
- ☆97 Updated 10 months ago
- Sparse autoencoders for vision ☆29 Updated 2 weeks ago
- Official PyTorch implementation of "Interpreting the Second-Order Effects of Neurons in CLIP" ☆39 Updated 6 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind ☆176 Updated 8 months ago
- ☆177 Updated last year
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆25 Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆97 Updated last year
- ☆51 Updated last month
- ☆31 Updated 4 months ago