AAAI 2024: Visual Instruction Generation and Correction
☆96Feb 4, 2024Updated 2 years ago
Alternatives and similar repositories for VIGC
Users that are interested in VIGC are comparing it to the libraries listed below
- Data Set Description Language Specification (next-generation AI dataset description language, DSDL)☆46May 29, 2024Updated last year
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM☆48May 24, 2024Updated last year
- ECCV2024_Parrot Captions Teach CLIP to Spot Text☆66Sep 6, 2024Updated last year
- datasets resource☆131Jul 1, 2025Updated 8 months ago
- Data annotation component library, provided as NPM packages☆145Nov 19, 2025Updated 3 months ago
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization☆100Jan 30, 2024Updated 2 years ago
- WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning.☆14Apr 18, 2024Updated last year
- ☆120Jan 15, 2026Updated last month
- ☆75Mar 7, 2024Updated last year
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)☆52Jul 16, 2024Updated last year
- [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(…☆329Oct 14, 2025Updated 4 months ago
- Detectron2 Toolbox and Benchmark for V3Det☆18Jun 2, 2024Updated last year
- ☆120Jun 11, 2024Updated last year
- WanJuan 1.0 multimodal corpus☆570Oct 20, 2023Updated 2 years ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆69May 31, 2024Updated last year
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning☆296Mar 13, 2024Updated last year
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆155Apr 30, 2024Updated last year
- [CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo…☆397Aug 24, 2024Updated last year
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale☆213Feb 27, 2024Updated 2 years ago
- List of papers on Hallucination in LMM☆10Nov 29, 2023Updated 2 years ago
- Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models"☆14Feb 21, 2024Updated 2 years ago
- Official implementation of Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation☆13Apr 15, 2024Updated last year
- (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions☆260Apr 14, 2024Updated last year
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria☆72Oct 16, 2024Updated last year
- A straightforward implementation of EGBM-based Generalized Additive Model☆14Oct 15, 2020Updated 5 years ago
- The official repository of paper "ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection" (N…☆50Oct 23, 2023Updated 2 years ago
- [ICML2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning"☆10Jul 24, 2022Updated 3 years ago
- 🍏 Repo prepared for the 2024 InternLM (书生·浦语) Large Model Challenge (Spring Season) 🍎 Contains fine-tuning source code related to Holo (赫萝)☆12Sep 20, 2024Updated last year
- ✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models☆650Dec 23, 2024Updated last year
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…☆86Jan 27, 2025Updated last year
- [ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"☆74Oct 22, 2025Updated 4 months ago
- Data annotation toolbox that supports image, audio, and video data.☆1,503Oct 1, 2025Updated 5 months ago
- [AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues☆61May 2, 2025Updated 10 months ago
- Dataset introduced in PlotQA: Reasoning over Scientific Plots☆84Jun 20, 2023Updated 2 years ago
- Normal Learning in Videos with Attention Prototype Network☆18Jan 19, 2023Updated 3 years ago
- ☆13Jul 30, 2024Updated last year
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…☆17Nov 11, 2024Updated last year
- ☆14Jan 9, 2026Updated last month
- ☆92Nov 25, 2023Updated 2 years ago