EvolvingLMMs-Lab/multimodal-sae


[ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.

☆199

Alternatives and similar repositories for multimodal-sae

Users that are interested in multimodal-sae are comparing it to the libraries listed below.

EvolvingLMMs-Lab / sae
A framework that lets you apply sparse autoencoders to any model
☆53 · Jul 11, 2025 · Updated last year
nickjiang2378 / vlm-hallucinations
[ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations"
☆105 · Nov 30, 2025 · Updated 7 months ago
PKU-Alignment / SAE-V
[ICML 2025 Poster] SAE-V: Interpreting Multimodal Models for Enhanced Alignment
☆17 · Jun 5, 2025 · Updated last year
EvolvingLMMs-Lab / OpenMMReasoner
[CVPR 2026] OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
☆164 · Mar 30, 2026 · Updated 3 months ago
ExplainableML / sae-for-vlm
[NeurIPS 2025] Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
☆89 · Jun 5, 2026 · Updated last month
UniX-AI-Lab / WorldReasonBench
WorldReasonBench: Human-Aligned Stress Testing of Video Generators as Future World-State Predictors
☆22 · May 19, 2026 · Updated 2 months ago
EvolvingLMMs-Lab / OneVision-Encoder
Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
☆386 · Jun 20, 2026 · Updated last month
clemneo / llava-interp
☆86 · Nov 5, 2024 · Updated last year
EvolvingLMMs-Lab / engram
Privacy-first AI memory layer - Signal for AI Memory. E2EE, local-first, works with Claude, Cursor, and any MCP-compatible AI.
☆23 · Jun 12, 2026 · Updated last month
EvolvingLMMs-Lab / LongVT
[CVPR 2026] LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
☆255 · Jun 24, 2026 · Updated last month
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆268 · Updated this week
EvolvingLMMs-Lab / EgoLife
[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
☆450 · Mar 19, 2025 · Updated last year
Prisma-Multimodal / ViT-Prisma
ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs).
☆380 · Jul 23, 2025 · Updated last year
pufanyi / syphus
Syphus: Automatic Instruction-Response Generation Pipeline
☆14 · Dec 14, 2023 · Updated 2 years ago
ssfgunner / VL-SAE
[NeurIPS 2025] This is the official repository for VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Se…
☆15 · Oct 29, 2025 · Updated 8 months ago
EvolvingLMMs-Lab / lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
☆4,331 · Updated this week
zhangbaijin / From-Redundancy-to-Relevance
[NAACL 2025 Oral] From redundancy to relevance: Enhancing explainability in multimodal large language models
☆130 · Jan 30, 2026 · Updated 5 months ago
EvolvingLMMs-Lab / LongVA
Long Context Transfer from Language to Vision
☆407 · Mar 18, 2025 · Updated last year
dongyh20 / Insight-V
[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
☆240 · Nov 7, 2025 · Updated 8 months ago
synvo-ai / local-cocoa
A local AI assistant running on your device. It turns your files into actionable memory.
☆55 · Mar 24, 2026 · Updated 4 months ago
EvolvingLMMs-Lab / open-r1-multimodal
A fork to add multimodal model training to open-r1
☆1,591 · Feb 8, 2025 · Updated last year
LALBJ / PAI
[ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
☆171 · Nov 6, 2024 · Updated last year
mrwu-mac / R-Bench
[ICML2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models"
☆24 · Jan 1, 2025 · Updated last year
AtsuMiyai / UPD
[ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models
☆82 · Mar 6, 2026 · Updated 4 months ago
DAMO-NLP-SG / VCD
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
☆410 · Oct 7, 2024 · Updated last year
Imageomics / saev
Sparse autoencoders for vision
☆64 · Updated this week
EvolvingLMMs-Lab / Aero-1
☆79 · May 4, 2025 · Updated last year
mshukor / xl-vlms
XL-VLMs: General Repository for eXplainable Large Vision Language Models
☆52 · Sep 8, 2025 · Updated 10 months ago
jun297 / v1
v1: Learning to Point Visual Tokens for Multimodal Grounded Reasoning
☆21 · Updated this week
adamkarvonen / SAEBench
☆178 · May 1, 2026 · Updated 2 months ago
swei2001 / RouteSAEs
☆15 · Jan 2, 2026 · Updated 6 months ago
EvolvingLMMs-Lab / ParaVT
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
☆54 · Jun 2, 2026 · Updated last month
bronyayang / Law_of_Vision_Representation_in_MLLMs
[COLM'25] Official implementation of the Law of Vision Representation in MLLMs
☆177 · Oct 6, 2025 · Updated 9 months ago
shengliu66 / VTI
Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering
☆117 · Nov 23, 2024 · Updated last year
EvolvingLMMs-Lab / MGPO
High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning
☆55 · Jul 23, 2025 · Updated last year
EvolvingLMMs-Lab / lmms-engine
A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.
☆810 · Updated this week
Luodian / nano-hevc
A minimal, educational HEVC (H.265) encoder written in Python.
☆53 · Feb 23, 2026 · Updated 5 months ago
EvolvingLMMs-Lab / SimpleStream
A simple video streaming baseline that outperforms SOTAs.
☆151 · May 1, 2026 · Updated 2 months ago
penghao-wu / visual_jigsaw
☆78 · Apr 9, 2026 · Updated 3 months ago