A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explore how LMMs represent, transform, and align multimodal information internally.
☆196Mar 4, 2026Updated 3 weeks ago
Alternatives and similar repositories for Awesome-LMMs-Mechanistic-Interpretability
Users that are interested in Awesome-LMMs-Mechanistic-Interpretability are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- awesome SAE papers☆73May 24, 2025Updated 10 months ago
- ☆238Nov 22, 2024Updated last year
- Official implementation of Visco-Attack (EMNLP 2025 Main). We will progressively release the code and one-click reproduction scripts.☆30Aug 22, 2025Updated 7 months ago
- awesome papers in LLM interpretability☆612Aug 20, 2025Updated 7 months ago
- ☆35Jun 13, 2025Updated 9 months ago
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab…☆160Aug 14, 2025Updated 7 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety☆58Jul 21, 2025Updated 8 months ago
- ScalingOpt - Optimization Community☆83Mar 22, 2026Updated last week
- [ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…☆80Jun 6, 2024Updated last year
- ☆83Nov 5, 2024Updated last year
- [AAAI 2025 oral] Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit☆19Apr 19, 2025Updated 11 months ago
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations"☆102Nov 30, 2025Updated 4 months ago
- SFT+RL boosts multimodal reasoning☆47Jun 27, 2025Updated 9 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)☆62Mar 30, 2024Updated last year
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- Ranking-Consistent Language-Image Pretraining☆12Oct 24, 2025Updated 5 months ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance"☆45Apr 21, 2024Updated last year
- ☆37Nov 14, 2025Updated 4 months ago
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models☆52Nov 17, 2024Updated last year
- Latest Advances on Modality Priors in Multimodal Large Language Models☆31Dec 10, 2025Updated 3 months ago
- ☆16Apr 14, 2021Updated 4 years ago
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc..☆295Jan 22, 2026Updated 2 months ago
- ☆13Apr 10, 2025Updated 11 months ago
- A tiny paper rating web☆40Mar 19, 2025Updated last year
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- ✨A curated list of papers on the uncertainty in multi-modal large language model (MLLM).☆59Apr 2, 2025Updated 11 months ago
- ☆45Jun 19, 2025Updated 9 months ago
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons☆13Feb 13, 2023Updated 3 years ago
- Localization of Knowledge in Text-to-Image Models☆12Oct 8, 2024Updated last year
- [ICML2025] Official code for "Reinforced Lifelong Editing for Language Models"☆21Feb 23, 2025Updated last year
- [CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images☆68Jan 23, 2026Updated 2 months ago
- Official PyTorch Implementation for the "What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-mod…☆20Sep 26, 2024Updated last year
- [NeurIPS 2024] Large Language Model Unlearning via Embedding-Corrupted Prompts☆38Sep 26, 2024Updated last year
- QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking☆38Jan 20, 2026Updated 2 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- The Github repo for our survey paper: "Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large…☆100Jan 30, 2026Updated 2 months ago
- The code of the paper "DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects"☆19May 2, 2025Updated 10 months ago
- Code for ICLR 2025 Failures to Find Transferable Image Jailbreaks Between Vision-Language Models☆36Jun 1, 2025Updated 9 months ago
- Accepted by IJCAI-24 Survey Track☆229Aug 25, 2024Updated last year
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.☆192Sep 26, 2025Updated 6 months ago
- Documentation for EEE Cluster 02☆43Mar 5, 2026Updated 3 weeks ago
- 📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).☆996Sep 27, 2025Updated 6 months ago