This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.
☆122Jun 18, 2025Updated 11 months ago
Alternatives and similar repositories for Vision-LLM-Alignment
Users that are interested in Vision-LLM-Alignment are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- This repo contains some extensions of deepspeed-chat for fine-tuning LLMs (SFT+RLHF).☆21Jul 2, 2024Updated last year
- Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation☆29Jun 30, 2025Updated 11 months ago
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆47Nov 10, 2024Updated last year
- Preference Learning for LLaVA☆59Nov 9, 2024Updated last year
- A RLHF Infrastructure for Vision-Language Models☆200Nov 15, 2024Updated last year
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- A repository used to organize content related to Large Speech(Audio) Model, including paper, data, applications, tools and so on.☆28Nov 8, 2025Updated 7 months ago
- A list of conferences and journals relevant to machine translation☆33Mar 17, 2022Updated 4 years ago
- Building a inclusive, scalable, and high-performance multilingual translation model☆127May 7, 2026Updated last month
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆91Nov 13, 2024Updated last year
- code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆63Aug 23, 2024Updated last year
- A tool for translating the content of LaTeX documents into various other natural languages (e.g., translating an arXiv paper from English…☆472May 6, 2026Updated last month
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆53Oct 19, 2024Updated last year
- [MICCAI 2024] Official code for the paper "MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation"☆14Nov 1, 2024Updated last year
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning☆93Apr 30, 2024Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Multilingual Translations of "Foundations of Large Language Models" and NLPBook.☆261Sep 18, 2025Updated 8 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆86Nov 10, 2024Updated last year
- We present a list of languages with their codes, families, regions and etc. We also present a list of multi-lingual corpora (with urls).☆87Jun 2, 2021Updated 5 years ago
- We introduce new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…☆22Jan 11, 2026Updated 5 months ago
- This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…☆1,421May 11, 2026Updated last month
- A benchmark for testing memorization abilities of LMs☆24Oct 15, 2024Updated last year
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆68May 31, 2024Updated 2 years ago
- ✨First Open-Source R1-like Video-LLM [2025/02/18]☆383Feb 23, 2025Updated last year
- [ACL 2025 🔥] Rethinking Step-by-step Visual Reasoning in LLMs☆309May 21, 2025Updated last year
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- [ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors".☆12Oct 11, 2024Updated last year
- Explore the Multimodal “Aha Moment” on 2B Model☆623Mar 18, 2025Updated last year
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.☆14Mar 20, 2024Updated 2 years ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆87Oct 26, 2025Updated 7 months ago
- A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025)☆12Aug 11, 2025Updated 10 months ago
- ☆48Dec 30, 2024Updated last year
- [BMVC 2024] On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models☆15Nov 1, 2024Updated last year
- MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment☆35Jul 1, 2024Updated last year
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization☆104Jan 30, 2024Updated 2 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation, ECCV 2024☆22Feb 15, 2024Updated 2 years ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆15Jun 21, 2024Updated last year
- [CVPR 2026] MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources☆218Sep 26, 2025Updated 8 months ago
- [CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback☆308Sep 11, 2024Updated last year
- Aligning LMMs with Factually Augmented RLHF☆396Nov 1, 2023Updated 2 years ago
- VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos☆23May 7, 2026Updated last month
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning☆24Sep 9, 2024Updated last year