zepingyu0512 / llava-mechanismLinks

code for arxiv paper: Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering

☆0

Alternatives and similar repositories for llava-mechanism

Users that are interested in llava-mechanism are comparing it to the libraries listed below

Sorting:

luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆42Updated 9 months ago
jinzhuoran / RAG-RewardBench
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
☆16Updated 7 months ago
hkust-nlp / RL-Verifier-Pitfalls
Pitfalls of Rule- and Model-based Verifiers: A Case Study on Mathematical Reasoning.
☆22Updated 2 months ago
GAIR-NLP / weak-to-strong-reasoning
☆59Updated 11 months ago
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆61Updated last year
SihengLi99 / LLM-Honesty-Survey
[2025-TMLR] A Survey on the Honesty of Large Language Models
☆58Updated 8 months ago
DynaMath / DynaMath
A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
☆24Updated 8 months ago
sail-sg / dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
☆44Updated 3 months ago
kaistAI / Volcano
[NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…
☆46Updated 11 months ago
zepingyu0512 / in-context-mechanism
code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M…
☆13Updated 8 months ago
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆62Updated 8 months ago
MozerWang / DEMO
[ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling
☆16Updated 7 months ago
hbin0701 / Self-Explore
[𝐄𝐌𝐍𝐋𝐏 𝐅𝐢𝐧𝐝𝐢𝐧𝐠𝐬 𝟐𝟎𝟐𝟒 & 𝐀𝐂𝐋 𝟐𝟎𝟐𝟒 𝐍𝐋𝐑𝐒𝐄 𝐎𝐫𝐚𝐥] 𝘌𝘯𝘩𝘢𝘯𝘤𝘪𝘯𝘨 𝘔𝘢𝘵𝘩𝘦𝘮𝘢𝘵𝘪𝘤𝘢𝘭 𝘙𝘦𝘢𝘴𝘰𝘯𝘪𝘯…
☆51Updated last year
r-three / smear
☆30Updated last year
GAIR-NLP / self-improvement-reversal
☆13Updated last year
OpenMOSS / Say-I-Dont-Know
[ICML'2024] Can AI Assistants Know What They Don't Know?
☆81Updated last year
MikaStars39 / FeatureAlignment
FeatureAlignment = Alignment + Mechanistic Interpretability
☆29Updated 5 months ago
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆64Updated 3 weeks ago
GAIR-NLP / BeHonest
BeHonest: Benchmarking Honesty in Large Language Models
☆34Updated 11 months ago
LuoXiaoHeics / Continual-Tune
☆10Updated 6 months ago
HillZhang1999 / ICD
Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations"
☆68Updated last year
Yifan-Song793 / GoodBadGreedy
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
☆30Updated last year
iwangjian / Midi-Tuning
Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue (ACL 2024)
☆23Updated last year
lezhang7 / MOQAGPT
[EMNLP'2023 Findings] MoqaGPT, for zero-shot multimodal question answering with LLMs
☆12Updated 7 months ago
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆79Updated 8 months ago
shawnricecake / Heima
Code for Heima
☆51Updated 3 months ago
RUCAIBox / RLMEC
The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"
☆38Updated last year
TobiasLee / VEC
Visual and Embodied Concepts evaluation benchmark
☆21Updated last year
HITsz-TMG / ICL-State-Vector
☆12Updated last year
chujiezheng / LLM-Extrapolation
Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"
☆75Updated 2 months ago