AIDC-AI / Wings
The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]
⭐13 · Updated 3 weeks ago
Alternatives and similar repositories for Wings:
Users interested in Wings are comparing it to the libraries listed below
- The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch. ⭐35 · Updated 4 months ago
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of… ⭐109 · Updated last month
- Official repository of the MMDU dataset ⭐82 · Updated 3 months ago
- ⭐32 · Updated this week
- [CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge ⭐128 · Updated 6 months ago
- [ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs ⭐91 · Updated 2 months ago
- A collection of visual instruction tuning datasets. ⭐76 · Updated 10 months ago
- SVIT: Scaling up Visual Instruction Tuning ⭐164 · Updated 7 months ago
- Official repo of "MMBench: Is Your Multi-modal Model an All-around Player?" ⭐173 · Updated 4 months ago
- 【NeurIPS 2024】Dense Connector for MLLMs ⭐154 · Updated 3 months ago
- The official implementation of RAR ⭐79 · Updated 9 months ago
- [NeurIPS 2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models" ⭐137 · Updated last week
- A RLHF Infrastructure for Vision-Language Models ⭐145 · Updated 2 months ago
- ⭐87 · Updated last year
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization ⭐77 · Updated 11 months ago
- ⭐59 · Updated 11 months ago
- ⭐94 · Updated last year
- [NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment ⭐56 · Updated 3 months ago
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model ⭐252 · Updated 6 months ago
- [CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback ⭐257 · Updated 4 months ago
- Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original c… ⭐28 · Updated 2 months ago
- ⭐33 · Updated 6 months ago
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ⭐77 · Updated 9 months ago
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat… ⭐76 · Updated 9 months ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning ⭐266 · Updated 10 months ago
- ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI ⭐96 · Updated 6 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs ⭐33 · Updated 2 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ⭐40 · Updated 2 months ago
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating ⭐87 · Updated 11 months ago
- The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity". Th… ⭐42 · Updated 2 months ago