A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explore how LMMs represent, transform, and align multimodal information internally.
☆205 · Mar 4, 2026 · Updated 3 months ago
Alternatives and similar repositories for Awesome-LMMs-Mechanistic-Interpretability
Users that are interested in Awesome-LMMs-Mechanistic-Interpretability are comparing it to the libraries listed below.
- awesome SAE papers ☆78 · May 24, 2025 · Updated last year
- ☆253 · Nov 22, 2024 · Updated last year
- Official implementation of Visco-Attack (EMNLP 2025 Main). An open-source one-click reproduction script is also provided. ☆30 · Apr 11, 2026 · Updated 2 months ago
- awesome papers in LLM interpretability ☆621 · Aug 20, 2025 · Updated 9 months ago
- ☆35 · Jun 13, 2025 · Updated last year
- A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab… ☆172 · Aug 14, 2025 · Updated 10 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆62 · Jul 21, 2025 · Updated 10 months ago
- ☆22 · Sep 16, 2025 · Updated 9 months ago
- ScalingOpt - Optimization Community ☆100 · Jun 1, 2026 · Updated 2 weeks ago
- ☆84 · Nov 5, 2024 · Updated last year
- [ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal… ☆86 · Jun 6, 2024 · Updated 2 years ago
- [AAAI 2025 oral] Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit ☆19 · Apr 19, 2025 · Updated last year
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆103 · Nov 30, 2025 · Updated 6 months ago
- SFT+RL boosts multimodal reasoning ☆50 · Jun 27, 2025 · Updated 11 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆62 · Mar 30, 2024 · Updated 2 years ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance" ☆46 · Apr 21, 2024 · Updated 2 years ago
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models ☆52 · Nov 17, 2024 · Updated last year
- This repository collects all relevant resources about interpretability in LLMs ☆401 · Nov 1, 2024 · Updated last year
- ☆16 · Apr 14, 2021 · Updated 5 years ago
- Latest Advances on Modality Priors in Multimodal Large Language Models ☆31 · Dec 10, 2025 · Updated 6 months ago
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc.. ☆307 · Jan 22, 2026 · Updated 4 months ago
- ☆13 · Apr 10, 2025 · Updated last year
- A tiny paper rating web ☆40 · Mar 19, 2025 · Updated last year
- ✨A curated list of papers on the uncertainty in multi-modal large language model (MLLM). ☆58 · Apr 2, 2025 · Updated last year
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons ☆14 · Feb 13, 2023 · Updated 3 years ago
- Localization of Knowledge in Text-to-Image Models ☆12 · Oct 8, 2024 · Updated last year
- [ICML2025] Official code for "Reinforced Lifelong Editing for Language Models" ☆23 · Feb 23, 2025 · Updated last year
- ☆46 · Jun 19, 2025 · Updated 11 months ago
- official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition" ☆234 · Jun 1, 2025 · Updated last year
- [CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images