llava-rlhf/LLaVA-RLHF

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/llava-rlhf/LLaVA-RLHF)

llava-rlhf / LLaVA-RLHF

Aligning LMMs with Factually Augmented RLHF

☆397

Alternatives and similar repositories for LLaVA-RLHF

Users that are interested in LLaVA-RLHF are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

RLHF-V / RLHF-V
View on GitHub
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
☆310Sep 11, 2024Updated last year
FuxiaoLiu / LRV-Instruction
View on GitHub
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆297Mar 13, 2024Updated 2 years ago
RLHF-V / RLAIF-V
View on GitHub
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
☆458May 14, 2025Updated last year
FreedomIntelligence / ALLaVA
View on GitHub
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
☆281Jun 25, 2024Updated 2 years ago
vlf-silkie / VLFeedback
View on GitHub
☆102Dec 22, 2023Updated 2 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
TideDra / VL-RLHF
View on GitHub
A RLHF Infrastructure for Vision-Language Models
☆201Nov 15, 2024Updated last year
YiyangZhou / LURE
View on GitHub
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆158Apr 30, 2024Updated 2 years ago
tianyi-lab / HallusionBench
View on GitHub
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(…
☆342Oct 14, 2025Updated 9 months ago
yuweihao / MM-Vet
View on GitHub
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
☆331Jan 20, 2025Updated last year
RifleZhang / LLaVA-Hound-DPO
View on GitHub
☆158Oct 31, 2024Updated last year
shikiw / OPERA
View on GitHub
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo…
☆411Aug 24, 2024Updated last year
PLUM-Lab / MultiInstruct
View on GitHub
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
☆135Jun 20, 2023Updated 3 years ago
LLaVA-VL / LLaVA-Plus-Codebase
View on GitHub
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
☆770Feb 1, 2024Updated 2 years ago
haoyiq114 / VALOR
View on GitHub
Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)
☆16Apr 23, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
LLaVA-VL / LLaVA-NeXT
View on GitHub
☆4,713Jun 15, 2026Updated last month
RUCAIBox / POPE
View on GitHub
The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''
☆266Aug 21, 2025Updated 11 months ago
tsb0601 / MMVP
View on GitHub
☆365Jan 27, 2024Updated 2 years ago
BAAI-DCAI / Visual-Instruction-Tuning
View on GitHub
SVIT: Scaling up Visual Instruction Tuning
☆167Jun 20, 2024Updated 2 years ago
Kwai-YuanQi / MM-RLHF
View on GitHub
The Next Step Forward in Multimodal LLM Alignment
☆199May 1, 2025Updated last year
YiyangZhou / POVID
View on GitHub
[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
☆94Apr 30, 2024Updated 2 years ago
X2FD / LVIS-INSTRUCT4V
View on GitHub
☆134Dec 22, 2023Updated 2 years ago
thunlp / Muffin
View on GitHub
☆65Feb 5, 2024Updated 2 years ago
SALT-NLP / LLaVAR
View on GitHub
Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"
☆268Jun 12, 2024Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
AILab-CVC / SEED-Bench
View on GitHub
(CVPR2024)A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
☆366Jan 14, 2025Updated last year
shikras / shikra
View on GitHub
☆814Jul 8, 2024Updated 2 years ago
Liuziyu77 / MMDU
View on GitHub
Official repository of MMDU dataset
☆108Sep 29, 2024Updated last year
mlfoundations / VisIT-Bench
View on GitHub
☆51Oct 29, 2023Updated 2 years ago
cambrian-mllm / cambrian
View on GitHub
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
☆2,009Nov 7, 2025Updated 8 months ago
JIA-Lab-research / LLaMA-VID
View on GitHub
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
☆861Jul 29, 2024Updated 2 years ago
yihedeng9 / STIC
View on GitHub
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
☆68May 31, 2024Updated 2 years ago
haotian-liu / LLaVA
View on GitHub
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,950Aug 12, 2024Updated last year
RunpeiDong / DreamLLM
View on GitHub
[ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation
☆462Dec 2, 2024Updated last year
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
VITA-MLLM / Woodpecker
View on GitHub
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
☆649Dec 23, 2024Updated last year
PKU-YuanGroup / MoE-LLaVA
View on GitHub
【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models
☆2,322Jul 15, 2025Updated last year
junyangwang0410 / AMBER
View on GitHub
An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation
☆173Jan 15, 2024Updated 2 years ago
WisconsinAIVision / ViP-LLaVA
View on GitHub
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
☆339Jul 17, 2024Updated 2 years ago
mlfoundations / open_flamingo
View on GitHub
An open-source framework for training large multimodal models.
☆4,116Aug 31, 2024Updated last year
EvolvingLMMs-Lab / open-r1-multimodal
View on GitHub
A fork to add multimodal model training to open-r1
☆1,594Feb 8, 2025Updated last year
IBM / SALMON
View on GitHub
Self-Alignment with Principle-Following Reward Models
☆170Sep 18, 2025Updated 10 months ago