multimodal-interpretability / maia
Official implementation of MAIA, A Multimodal Automated Interpretability Agent
☆74 · Updated 6 months ago
Alternatives and similar repositories for maia:
Users who are interested in maia are comparing it to the repositories listed below.
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆106 · Updated last month
- ☆73 · Updated 6 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆158 · Updated 2 months ago
- Holistic evaluation of multimodal foundation models ☆42 · Updated 6 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆140 · Updated 4 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆74 · Updated 5 months ago
- Official PyTorch Implementation for Task Vectors are Cross-Modal ☆22 · Updated 2 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆68 · Updated 3 months ago
- Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" (ICLR '25) ☆60 · Updated last month
- ☆28 · Updated last month
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" ☆48 · Updated 9 months ago
- ☆167 · Updated last year
- Official implementation of FIND (NeurIPS '23) Function Interpretation Benchmark and Automated Interpretability Agents ☆49 · Updated 5 months ago
- ☆95 · Updated 8 months ago
- ☆153 · Updated this week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆54 · Updated 6 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆141 · Updated 11 months ago
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific… ☆65 · Updated 5 months ago
- ☆152 · Updated 3 weeks ago
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆33 · Updated 3 months ago
- ☆39 · Updated 7 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆81 · Updated last year
- The official implementation of Self-Exploring Language Models (SELM) ☆61 · Updated 9 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆120 · Updated 3 weeks ago
- Matryoshka Multimodal Models ☆97 · Updated last month
- Steering vectors for transformer language models in Pytorch / Huggingface ☆88 · Updated last week
- ☆137 · Updated 9 months ago
- Language models scale reliably with over-training and on downstream tasks ☆96 · Updated 11 months ago