itsqyh / Awesome-LMMs-Mechanistic-Interpretability
A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explore how LMMs represent, transform, and align multimodal information internally.
☆183 Oct 20, 2025 Updated 3 months ago
Alternatives and similar repositories for Awesome-LMMs-Mechanistic-Interpretability
Users who are interested in Awesome-LMMs-Mechanistic-Interpretability are comparing it to the repositories listed below
- awesome SAE papers ☆72 May 24, 2025 Updated 8 months ago
- ☆230 Nov 22, 2024 Updated last year
- Official implementation of Visco-Attack (EMNLP 2025 Main). We will progressively release the code and one-click reproduction scripts. ☆28 Aug 22, 2025 Updated 5 months ago
- ☆79 Nov 5, 2024 Updated last year
- [ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal… ☆79 Jun 6, 2024 Updated last year
- Official implementation of "Interpreting and Controlling Vision Foundation Models via Text Explanations" ☆14 May 29, 2024 Updated last year
- ☆13 Apr 10, 2025 Updated 10 months ago
- A very hacky set of functions for getting plotly to do what I want when doing mech interp research, designed to be compatible with PyTorc… ☆11 Jun 16, 2023 Updated 2 years ago
- SFT+RL boosts multimodal reasoning ☆46 Jun 27, 2025 Updated 7 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆54 Jul 21, 2025 Updated 6 months ago
- Latest Advances on Modality Priors in Multimodal Large Language Models ☆30 Dec 10, 2025 Updated 2 months ago
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆30 Oct 27, 2025 Updated 3 months ago
- High-performance key-value store ☆12 Dec 31, 2018 Updated 7 years ago
- ☆33 Nov 14, 2025 Updated 3 months ago
- ☆22 Sep 16, 2025 Updated 5 months ago
- Localization of Knowledge in Text-to-Image Models ☆12 Oct 8, 2024 Updated last year
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons ☆13 Feb 13, 2023 Updated 3 years ago
- [AAAI 2025 oral] Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit ☆19 Apr 19, 2025 Updated 9 months ago
- ☆36 Jun 13, 2025 Updated 8 months ago
- A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab… ☆155 Aug 14, 2025 Updated 6 months ago
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc.. ☆292 Jan 22, 2026 Updated 3 weeks ago
- The Github repo for our survey paper: "Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large… ☆85 Jan 30, 2026 Updated 2 weeks ago
- ☆71 Oct 1, 2025 Updated 4 months ago
- A Benchmark Study on Machine Learning Methods for Fake News Detection ☆16 Jun 8, 2021 Updated 4 years ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆182 Sep 26, 2025 Updated 4 months ago
- ☆44 Jun 19, 2025 Updated 7 months ago
- Documentation for EEE Cluster 02 ☆39 Feb 5, 2026 Updated last week
- QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking ☆34 Jan 20, 2026 Updated 3 weeks ago
- [ICML2025] Official code for "Reinforced Lifelong Editing for Language Models" ☆21 Feb 23, 2025 Updated 11 months ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance" ☆44 Apr 21, 2024 Updated last year
- 📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM). ☆979 Sep 27, 2025 Updated 4 months ago
- ☆17 Apr 14, 2021 Updated 4 years ago
- A tiny paper rating web ☆39 Mar 19, 2025 Updated 10 months ago
- Official PyTorch Implementation for the "What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-mod… ☆19 Sep 26, 2024 Updated last year
- [ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models. ☆84 Jan 19, 2025 Updated last year
- The code of the paper "DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects" ☆20 May 2, 2025 Updated 9 months ago
- ☆71 Jul 24, 2025 Updated 6 months ago
- ☆57 Nov 26, 2024 Updated last year
- ☆23 Jun 13, 2024 Updated last year