NiuTrans/Vision-LLM-Alignment

NiuTrans / Vision-LLM-Alignment

This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.

☆122

Alternatives and similar repositories for Vision-LLM-Alignment

Users who are interested in Vision-LLM-Alignment are comparing it to the repositories listed below.

wangclnlp / DeepSpeed-Chat-Extension
This repo contains some extensions of deepspeed-chat for fine-tuning LLMs (SFT+RLHF).
☆21 · Jul 2, 2024 · Updated 2 years ago

NiuTrans / GRAM
Code for ICML 2025 paper "GRAM: A Generative Foundation Reward Model for Reward Generalization"
☆21 · Sep 4, 2025 · Updated 10 months ago

NiuTrans / LaMaTE
Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation
☆30 · Jun 30, 2025 · Updated last year

NiuTrans / ODEs-in-Vision-and-Language
An introduction to ODEs and their applications in vision and language
☆15 · Feb 26, 2026 · Updated 4 months ago

findalexli / mllm-dpo
[ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model
☆48 · Nov 10, 2024 · Updated last year

NiuTrans / LMT
Building an inclusive, scalable, and high-performance multilingual translation model
☆126 · May 7, 2026 · Updated 2 months ago

YuxiXie / V-DPO
Preference Learning for LLaVA
☆60 · Nov 9, 2024 · Updated last year

TideDra / VL-RLHF
An RLHF Infrastructure for Vision-Language Models
☆201 · Nov 15, 2024 · Updated last year

AI9Stars / LLM-Rubrics-Survey
A survey of rubrics across the evolving LLM landscape.
☆65 · Jul 2, 2026 · Updated 3 weeks ago

hananshafi / MedContext
[MICCAI 2024] Official code for the paper "MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation"
☆14 · Nov 1, 2024 · Updated last year

luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆54 · Oct 19, 2024 · Updated last year

pipilurj / bootstrapped-preference-optimization-BPO
Code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"
☆63 · Aug 23, 2024 · Updated last year

intervention-training / int
☆16 · Feb 4, 2026 · Updated 5 months ago

YiyangZhou / POVID
[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
☆94 · Apr 30, 2024 · Updated 2 years ago

luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆88 · Nov 10, 2024 · Updated last year

xuchennlp / S2T
The project for speech translation
☆12 · Sep 28, 2023 · Updated 2 years ago

FreedomIntelligence / TRIM
We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆22 · Jan 11, 2026 · Updated 6 months ago

NiuTrans / LanguageCodes
We present a list of languages with their codes, families, regions, etc. We also present a list of multilingual corpora (with URLs).
☆87 · Jun 2, 2021 · Updated 5 years ago

hananshafi / MTL-ViT
A new multi-task learning framework using Vision Transformers
☆11 · Jun 19, 2024 · Updated 2 years ago

NiuTrans / ForgettingCurve
A benchmark for testing memorization abilities of LMs
☆24 · Oct 15, 2024 · Updated last year

yihedeng9 / STIC
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
☆68 · May 31, 2024 · Updated 2 years ago

NiuTrans / Introduction-to-Transformers
An introduction to basic concepts of Transformers and key techniques of their recent advances.
☆53 · Dec 21, 2023 · Updated 2 years ago

Wang-Xiaodong1899 / Open-R1-Video
✨First Open-Source R1-like Video-LLM [2025/02/18]
☆382 · Jul 1, 2026 · Updated 3 weeks ago

LuLuLuyi / TDAR
Advancing Block Diffusion Language Models for Test-Time Scaling
☆16 · Feb 14, 2026 · Updated 5 months ago

mbzuai-oryx / LlamaV-o1
[ACL 2025 🔥] Rethinking Step-by-step Visual Reasoning in LLMs
☆307 · May 21, 2025 · Updated last year

si0wang / VisVM
☆46 · Dec 30, 2024 · Updated last year

fahadshamshad / deep-facial-privacy-prior
[ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors".
☆12 · Oct 11, 2024 · Updated last year

Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs
This repository provides a valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…
☆1,435 · May 11, 2026 · Updated 2 months ago

techmn / cosnet
A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025)
☆12 · Aug 11, 2025 · Updated 11 months ago

turningpoint-ai / VisualThinker-R1-Zero
Explore the Multimodal “Aha Moment” on 2B Model
☆624 · Mar 18, 2025 · Updated last year

YiyangZhou / CSR
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
☆87 · Oct 26, 2025 · Updated 8 months ago

HashmatShadab / Robustness-of-Volumetric-Medical-Segmentation-Models
[BMVC 2024] On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models
☆15 · Nov 1, 2024 · Updated last year

NiuTrans / ToFu
Self-hosted AI assistant with tool use, multi-agent orchestration, coding copilot and a lightweight Flask + vanilla JS stack.
☆133 · Updated this week

gauss5930 / iDUS
An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.
☆14 · Mar 20, 2024 · Updated 2 years ago

opendatalab / HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
☆104 · Jan 30, 2024 · Updated 2 years ago

keven980716 / weak-to-strong-deception
[ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"
☆15 · Jun 21, 2024 · Updated 2 years ago

iabh1shekbasu / CalibPrompt
[BMVC 2025 🔥] CalibPrompt is the first framework that enhances Med-VLM calibration during prompt tuning.
☆16 · Jul 13, 2026 · Updated last week

MeiGen-AI / PosterReward
[CVPR2026] PosterReward: Unlocking Accurate Evaluation for High-Quality Graphic Design Generation
☆32 · Apr 2, 2026 · Updated 3 months ago

LengSicong / MMR1
[CVPR 2026] MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
☆217 · Sep 26, 2025 · Updated 9 months ago