Hxyou / IdealGPTLinks

Official Code of IdealGPT

☆35

Alternatives and similar repositories for IdealGPT

Users that are interested in IdealGPT are comparing it to the libraries listed below

Sorting:

YiyangZhou / POVID
[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
☆88Updated last year
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆150Updated last year
zwq2018 / Multi-modal-Self-instruct
The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…
☆84Updated 9 months ago
YuxiXie / V-DPO
Preference Learning for LLaVA
☆54Updated last year
Yangyi-Chen / CoTConsistency
The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".
☆34Updated 2 years ago
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆83Updated last year
Yuqifan1117 / HalluciDoctor
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)
☆49Updated last year
haoyiq114 / VALOR
Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)
☆16Updated last year
vlf-silkie / VLFeedback
☆100Updated last year
PLUM-Lab / MultiInstruct
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
☆133Updated 2 years ago
yihedeng9 / STIC
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
☆70Updated last year
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆55Updated last year
mlfoundations / VisIT-Bench
☆50Updated 2 years ago
YiyangZhou / CSR
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
☆80Updated 3 weeks ago
bcdnlp / FAITHSCORE
FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models
☆30Updated 8 months ago
ChenDelong1999 / polite-flamingo
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
☆64Updated last year
AoiDragon / POPE
[EMNLP'23] The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''
☆96Updated 3 months ago
opendatalab / HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
☆98Updated last year
shizhediao / DaVinci
Source code for the paper "Prefix Language Models are Unified Modal Learners"
☆42Updated 2 years ago
edchengg / infoseek_eval
EMNLP2023 - InfoSeek: A New VQA Benchmark focus on Visual Info-Seeking Questions
☆25Updated last year
TencentARC / GVT
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
☆58Updated 2 years ago
pkunlp-icler / MIC
MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU
☆50Updated 4 months ago
yfzhang114 / LLaVA-Align
[ACM Multimedia 2025] This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual…
☆82Updated 9 months ago
rohan598 / ConTextual
☆27Updated last year
thunlp / Muffin
☆66Updated last year
cambridgeltl / visual-spatial-reasoning
[TACL'23] VSR: A probing benchmark for spatial undersranding of vision-language models.
☆133Updated 2 years ago
Letian2003 / C-VQA
Counterfactual Reasoning VQA Dataset
☆25Updated last year
joez17 / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆50Updated 8 months ago
palchenli / VL-Instruction-Tuning
☆91Updated last year
archiki / RepARe
☆20Updated 2 years ago