PLUM-Lab / MultiInstruct
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
★134 · Updated 2 years ago
Alternatives and similar repositories for MultiInstruct
Users interested in MultiInstruct are comparing it to the libraries listed below.
- 🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral) ★64 · Updated last year
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ★169 · Updated last year
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning ★291 · Updated last year
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ★150 · Updated last year
- This repo contains code and instructions for baselines in the VLUE benchmark. ★41 · Updated 3 years ago
- ★40 · Updated 2 years ago
- Official repository for the A-OKVQA dataset ★104 · Updated last year
- ★100 · Updated last year
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating ★98 · Updated last year
- FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models ★31 · Updated last week
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ★231 · Updated 3 months ago
- EMNLP 2023 - InfoSeek: A New VQA Benchmark focusing on Visual Info-Seeking Questions ★25 · Updated last year
- ★66 · Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ★44 · Updated last year
- Official Code of IdealGPT ★35 · Updated 2 years ago
- A curated list of awesome LMM hallucination papers, methods & resources. ★150 · Updated last year
- [TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models. ★133 · Updated 2 years ago
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria ★72 · Updated last year
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning ★96 · Updated 10 months ago
- An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral) ★86 · Updated 3 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022) ★207 · Updated 2 years ago
- ★155 · Updated last year
- [ICLR 2023] This is the code repo for our ICLR'23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…" ★53 · Updated last year
- ★85 · Updated 6 years ago
- ★67 · Updated 2 years ago
- PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners" ★115 · Updated 3 years ago
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ★89 · Updated last year
- ★67 · Updated 2 years ago
- ★133 · Updated last year
- SVIT: Scaling up Visual Instruction Tuning ★164 · Updated last year