AAAI 2024: Visual Instruction Generation and Correction
☆96Feb 4, 2024Updated 2 years ago
Alternatives and similar repositories for VIGC
Users that are interested in VIGC are comparing it to the libraries listed below.
- Data Set Description Language Specification (DSDL, a next-generation dataset description language for AI)☆46May 29, 2024Updated last year
- SDK of OpenDataLab - https://opendatalab.org.cn☆59Jul 31, 2025Updated 9 months ago
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM☆48May 24, 2024Updated last year
- datasets resource☆138Apr 14, 2026Updated 2 weeks ago
- Data annotation component library, provided as NPM packages☆149Apr 21, 2026Updated last week
- ☆121Jan 15, 2026Updated 3 months ago
- ECCV2024_Parrot Captions Teach CLIP to Spot Text☆66Sep 6, 2024Updated last year
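- WanJuan 1.0 multimodal corpus☆572Oct 20, 2023Updated 2 years ago
- WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning.☆13Apr 18, 2024Updated 2 years ago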
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization☆103Jan 30, 2024Updated 2 years ago
- The Open-Source Data Annotation Platform☆1,218Feb 19, 2025Updated last year
- ☆121Jun 11, 2024Updated last year
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)☆52Jul 16, 2024Updated last year
- This is the repo for the paper Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.☆47Aug 22, 2025Updated 8 months ago
- [CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo…☆406Aug 24, 2024Updated last year
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆469Sep 28, 2025Updated 7 months ago
- [ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"☆78Oct 22, 2025Updated 6 months ago
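- WanJuan3.0 ("WanJuan Silk Road") is a comprehensive plain-text corpus that collects publicly available web information, literature, patents, and other materials from multiple countries and regions. Its total size exceeds 1.2 TB with more than 300B tokens, which the authors describe as internationally leading. The first open-source release mainly consists of five subsets (Thai, Russian, Arabic, Korean, and Vietnamese); the data in each subset…☆46Feb 13, 2025Updated last year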
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆156Apr 30, 2024Updated 2 years ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆68May 31, 2024Updated last year
- [ICML2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning"☆10Jul 24, 2022Updated 3 years ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning☆296Mar 13, 2024Updated 2 years ago
- PICABench: How Far Are We from Physically Realistic Image Editing?☆36Nov 5, 2025Updated 5 months ago
- Normal Learning in Videos with Attention Prototype Network☆18Jan 19, 2023Updated 3 years ago
- The official pytorch implementation of Exploring the User Guidance for More Accurate Building Segmentation from High-Resolution Remote Se…☆18May 27, 2024Updated last year
- (ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation☆97Dec 3, 2025Updated 5 months ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆45Jun 14, 2024Updated last year
- List of papers on Hallucination in LMM☆10Nov 29, 2023Updated 2 years ago
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria☆76Oct 16, 2024Updated last year
- Official PyTorch implementation of LaMI: Augmenting Large Language Models via Late Multi-Image Fusion (ACL 2026)☆17Apr 14, 2026Updated 2 weeks ago
- [NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"☆207Sep 26, 2024Updated last year
- ☆75Mar 7, 2024Updated 2 years ago
- [ACL 2024 Main Conference] Chinese commonsense benchmark for LLMs☆45Jul 27, 2024Updated last year
- ✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models☆647Dec 23, 2024Updated last year
- 🍏A repo prepared for the 2024 InternLM (书生·浦语) Large Model Challenge (Spring Season)🍎Contains fine-tuning source code related to Holo (赫萝)☆12Sep 20, 2024Updated last year
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale☆214Feb 27, 2024Updated 2 years ago
- [ACL 2025 Findings] Official pytorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vis…☆25Jul 21, 2024Updated last year
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…☆17Nov 11, 2024Updated last year
- (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions☆260Apr 14, 2024Updated 2 years ago