A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explore how LMMs represent, transform, and align multimodal information internally.
☆200 · Mar 4, 2026 · Updated 2 months ago
Alternatives and similar repositories for Awesome-LMMs-Mechanistic-Interpretability
Users who are interested in Awesome-LMMs-Mechanistic-Interpretability are comparing it to the libraries listed below.
- awesome SAE papers ☆77 · May 24, 2025 · Updated last year
- ☆249 · Nov 22, 2024 · Updated last year
- Official implementation of Visco-Attack (EMNLP 2025 Main). An open-source one-click reproduction script is also provided. ☆30 · Apr 11, 2026 · Updated last month
- awesome papers in LLM interpretability ☆620 · Aug 20, 2025 · Updated 9 months ago
- A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab… ☆171 · Aug 14, 2025 · Updated 9 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆62 · Jul 21, 2025 · Updated 10 months ago
- ☆22 · Sep 16, 2025 · Updated 8 months ago
- ScalingOpt - Optimization Community ☆98 · Updated this week
- [ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal… ☆82 · Jun 6, 2024 · Updated last year
- [AAAI 2025 oral] Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit ☆19 · Apr 19, 2025 · Updated last year
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆103 · Nov 30, 2025 · Updated 6 months ago
- SFT+RL boosts multimodal reasoning ☆49 · Jun 27, 2025 · Updated 11 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆62 · Mar 30, 2024 · Updated 2 years ago
- Ranking-Consistent Language-Image Pretraining ☆13 · Oct 24, 2025 · Updated 7 months ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance" ☆46 · Apr 21, 2024 · Updated 2 years ago
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models ☆52 · Nov 17, 2024 · Updated last year
- This repository collects all relevant resources about interpretability in LLMs ☆400 · Nov 1, 2024 · Updated last year
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc.. ☆306 · Jan 22, 2026 · Updated 4 months ago
- ☆13 · Apr 10, 2025 · Updated last year
- A tiny paper rating web ☆40 · Mar 19, 2025 · Updated last year
- ☆45 · Jun 19, 2025 · Updated 11 months ago
- ✨A curated list of papers on the uncertainty in multi-modal large language model (MLLM). ☆58 · Apr 2, 2025 · Updated last year
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons ☆14 · Feb 13, 2023 · Updated 3 years ago
- Localization of Knowledge in Text-to-Image Models ☆12 · Oct 8, 2024 · Updated last year
- [ICML2025] Official code for "Reinforced Lifelong Editing for Language Models" ☆22 · Feb 23, 2025 · Updated last year
- Official PyTorch Implementation for the "What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-mod… ☆20 · Sep 26, 2024 · Updated last year
- official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition" ☆234 · Jun 1, 2025 · Updated 11 months ago
- [CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images ☆70 · Jan 23, 2026 · Updated 4 months ago
- QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking ☆39 · Jan 20, 2026 · Updated 4 months ago
- The code of the paper "DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects" ☆19 · May 2, 2025 · Updated last year
- (NeurIPS 2025) Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation ☆72 · May 21, 2026 · Updated last week
- Accepted by IJCAI-24 Survey Track ☆233 · Aug 25, 2024 · Updated last year
- Documentation for EEE Cluster 02 ☆49 · May 12, 2026 · Updated 2 weeks ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆198 · Sep 26, 2025 · Updated 8 months ago
- Official implementation of "Interpreting and Controlling Vision Foundation Models via Text Explanations" ☆14 · May 29, 2024 · Updated 2 years ago
- 📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM). ☆1,020 · Sep 27, 2025 · Updated 8 months ago
- A paper list on LLMs and multimodal LLMs ☆59 · May 12, 2026 · Updated 2 weeks ago
- A very hacky set of functions for getting plotly to do what I want when doing mech interp research, designed to be compatible with PyTorc… ☆13 · Jun 16, 2023 · Updated 2 years ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training ☆23 · Aug 18, 2024 · Updated last year