Quinn777 / Previous-AtomThink

☆54

Alternatives and similar repositories for Previous-AtomThink:

Users who are interested in Previous-AtomThink are comparing it to the repositories listed below.

LightChen233 / M3CoT
☆64 Updated 9 months ago
njucckevin / MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
☆69 Updated last month
RUCAIBox / Virgo
Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*
☆95 Updated 2 weeks ago
Liuziyu77 / MMDU
Official repository of MMDU dataset
☆86 Updated 5 months ago
TideDra / VL-RLHF
An RLHF Infrastructure for Vision-Language Models
☆167 Updated 3 months ago
RifleZhang / LLaVA-Reasoner-DPO
☆66 Updated 2 months ago
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆69 Updated 4 months ago
OpenGVLab / MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…
☆114 Updated 3 months ago
findalexli / mllm-dpo
[ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model
☆34 Updated 4 months ago
chancharikmitra / CCoT
[CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"
☆112 Updated 8 months ago
mrwu-mac / ControlMLLM
[NeurIPS 2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"
☆148 Updated last month
yuecao0119 / MMInstruct
[SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…
☆47 Updated 4 months ago
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆43 Updated 4 months ago
InfiMM / Awesome-Multimodal-LLM-for-Math-STEM
Paper collections of multi-modal LLM for Math/STEM/Code.
☆80 Updated 3 weeks ago
Liuziyu77 / MIA-DPO
Official implementation of MIA-DPO
☆52 Updated last month
opendatalab / HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
☆82 Updated last year
HZQ950419 / Math-LLaVA
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
☆78 Updated 8 months ago
mathllm / MATH-V
[NeurIPS DB Track, 2024] MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities.
☆90 Updated this week
vlf-silkie / VLFeedback
☆94 Updated last year
FudanDISC / ReForm-Eval
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
☆45 Updated last year
zwq2018 / Multi-modal-Self-instruct
The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…
☆72 Updated last month
si0wang / VisVM
☆38 Updated 2 months ago
RupertLuo / VoCoT
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
☆48 Updated 8 months ago
OpenGVLab / MMT-Bench
ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
☆101 Updated 7 months ago
FudanNLPLAB / MouSi
☆73 Updated last year
foundation-multimodal-models / CAL
[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
☆57 Updated 5 months ago
BAAI-DCAI / DataOptim
A collection of visual instruction tuning datasets.
☆76 Updated last year