hychaochao / EMMA

The official repository for the paper "Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark"

☆50

Alternatives and similar repositories for EMMA:

Users that are interested in EMMA are comparing it to the libraries listed below

ShadeCloak / ADORA
☆41Updated last week
SUSTechBruce / LOOK-M
[EMNLP 2024 Findings🔥] Official implementation of ": LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…
☆92Updated 5 months ago
njucckevin / MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
☆76Updated 2 months ago
hkust-nlp / mstar
M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆56Updated 3 months ago
Osilly / dynamic_llava
[ICLR 2025] The official pytorch implement of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont…
☆29Updated 4 months ago
pipilurj / bootstrapped-preference-optimization-BPO
code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"
☆54Updated 7 months ago
zhijie-group / Orthus
☆21Updated 2 weeks ago
LightChen233 / M3CoT
☆71Updated 10 months ago
OpenSparseLLMs / LLaMA-MoE-v2
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
☆80Updated 4 months ago
OpenSparseLLMs / MoM
☆74Updated last month
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆71Updated 5 months ago
lzhxmu / VTW
Code release for VTW (AAAI 2025) Oral
☆34Updated 3 months ago
RifleZhang / LLaVA-Reasoner-DPO
☆71Updated 3 months ago
OpenGVLab / V2PE
[ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding
☆42Updated 4 months ago
horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆65Updated 2 months ago
YiyangZhou / CSR
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
☆71Updated 10 months ago
MME-Benchmarks / MME-CoT
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
☆95Updated 3 weeks ago
ssmisya / PRMBench
The official code repository for PRMBench.
☆72Updated 2 months ago
GATECH-EIC / ACT
[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…
☆36Updated 9 months ago
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆46Updated 5 months ago
IDEA-FinAI / ChartMoE
[ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
☆72Updated 2 weeks ago
RupertLuo / VoCoT
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
☆53Updated 9 months ago
haonan3 / V1
V1: Toward Multimodal Reasoning by Designing Auxiliary Task
☆31Updated last week
1zhou-Wang / MemVR
Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal …
☆47Updated last month
Liuziyu77 / MIA-DPO
Official implement of MIA-DPO
☆55Updated 2 months ago
Open-DataFlow / Awesome_MLLMs_Reasoning
☆89Updated last week
RUCAIBox / Virgo
Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*
☆99Updated last month
UCSC-VLAA / VLAA-Thinking
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
☆84Updated this week
joez17 / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆47Updated last month
DAMO-NLP-SG / CMM
✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
☆45Updated 6 months ago