PLUM-Lab / MultiInstructLinks

MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning

☆134

Alternatives and similar repositories for MultiInstruct

Users that are interested in MultiInstruct are comparing it to the libraries listed below

Sorting:

ChenDelong1999 / polite-flamingo
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
☆64Updated last year
TIGER-AI-Lab / UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
☆169Updated last year
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆291Updated last year
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆150Updated last year
MichaelZhouwang / VLUE
This repo contains codes and instructions for baselines in the VLUE benchmark.
☆41Updated 3 years ago
open-vision-language / oven
☆40Updated 2 years ago
allenai / aokvqa
Official repository for the A-OKVQA dataset
☆104Updated last year
vlf-silkie / VLFeedback
☆100Updated last year
X-PLUG / mPLUG-HalOwl
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating
☆98Updated last year
bcdnlp / FAITHSCORE
FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models
☆31Updated last week
RUCAIBox / POPE
The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''
☆231Updated 3 months ago
edchengg / infoseek_eval
EMNLP2023 - InfoSeek: A New VQA Benchmark focus on Visual Info-Seeking Questions
☆25Updated last year
thunlp / Muffin
☆66Updated last year
HYPJUDY / Sparkles
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
☆44Updated last year
Hxyou / IdealGPT
Official Code of IdealGPT
☆35Updated 2 years ago
xieyuquanxx / awesome-Large-MultiModal-Hallucination
😎 curated list of awesome LMM hallucinations papers, methods & resources.
☆150Updated last year
cambridgeltl / visual-spatial-reasoning
[TACL'23] VSR: A probing benchmark for spatial undersranding of vision-language models.
☆133Updated 2 years ago
FreedomIntelligence / MLLM-Bench
MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
☆72Updated last year
FuxiaoLiu / MMC
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
☆96Updated 10 months ago
microsoft / PICa
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral)
☆86Updated 3 years ago
ylsung / VL_adapter
PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)
☆207Updated 2 years ago
RifleZhang / LLaVA-Hound-DPO
☆155Updated last year
OpenMatch / UniVL-DR
[ICLR 2023] This is the code repo for our ICLR‘23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…
☆53Updated last year
LisaAnne / Hallucination
☆85Updated 6 years ago
limanling / KnowledgeVL-Reading
☆67Updated 2 years ago
MikeWangWZHL / VidIL
Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆115Updated 3 years ago
YiyangZhou / POVID
[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
☆89Updated last year
open-vision-language / infoseek
☆67Updated 2 years ago
X2FD / LVIS-INSTRUCT4V
☆133Updated last year
BAAI-DCAI / Visual-Instruction-Tuning
SVIT: Scaling up Visual Instruction Tuning
☆164Updated last year