opendatalab/VIGC

AAAI 2024: Visual Instruction Generation and Correction

☆97

Alternatives and similar repositories for VIGC

Users who are interested in VIGC are comparing it to the repositories listed below.

opendatalab / dsdl-docs
Data Set Description Language Specification (DSDL, a next-generation AI dataset description language)
☆46 · May 29, 2024 · Updated 2 years ago
opendatalab / MLLM-DataEngine
MLLM-DataEngine: An Iterative Refinement Approach for MLLM
☆49 · May 24, 2024 · Updated 2 years ago
opendatalab / opendatalab-datasets
datasets resource
☆150 · May 27, 2026 · Updated last month
opendatalab / labelU-Kit
Data annotation component library, provided as NPM packages
☆158 · Updated this week
opendatalab / labelbee
☆25 · Nov 7, 2022 · Updated 3 years ago
opendatalab / laion5b-downloader
☆122 · Jan 15, 2026 · Updated 6 months ago
opendatalab / CLIP-Parrot-Bias
ECCV 2024: Parrot Captions Teach CLIP to Spot Text
☆66 · Sep 6, 2024 · Updated last year
opendatalab / WanJuan1.0
WanJuan 1.0 multimodal corpus
☆574 · Oct 20, 2023 · Updated 2 years ago
conghui / replaycode
ReplayCode: first open-source rebuild of Claude Code that actually runs. Built from decompiled source with Node.js/esbuild
☆20 · Apr 1, 2026 · Updated 3 months ago
opendatalab / WanJuan2.0-WanJuan-CC
WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning.
☆14 · Apr 18, 2024 · Updated 2 years ago
opendatalab / HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
☆104 · Jan 30, 2024 · Updated 2 years ago
opendatalab / labelU
Open-source multimodal data annotation platform with AI auto-annotation support.
☆1,633 · Jul 16, 2026 · Updated last week
opendatalab / Meta-rater
[ACL 2025 Best Theme Paper] This is the official implementation for the paper: "Meta-rater: A Multi-dimensional Data Selection Method for…
☆196 · Aug 29, 2025 · Updated 10 months ago
V3Det / V3Det
☆121 · Jun 11, 2024 · Updated 2 years ago
Yuqifan1117 / HalluciDoctor
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)
☆52 · Jul 16, 2024 · Updated 2 years ago
opendatalab / LabelLLM
The Open-Source Data Annotation Platform
☆1,262 · Jul 2, 2026 · Updated 3 weeks ago
shikiw / OPERA
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo…
☆411 · Aug 24, 2024 · Updated last year
opendatalab / CHARM
[ACL 2024 Main Conference] Chinese commonsense benchmark for LLMs
☆46 · Jul 27, 2024 · Updated last year
Relaxed-System-Lab / multi-actor-data-selection
This is the repo for the paper Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.
☆49 · Aug 22, 2025 · Updated 11 months ago
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆158 · Apr 30, 2024 · Updated 2 years ago
RUCAIBox / POPE
The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"
☆265 · Aug 21, 2025 · Updated 11 months ago
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆297 · Mar 13, 2024 · Updated 2 years ago
opendatalab / UniMERNet
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
☆492 · Sep 28, 2025 · Updated 9 months ago
tianyi-lab / HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(…
☆342 · Oct 14, 2025 · Updated 9 months ago
opendatalab / OHR-Bench
(ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
☆104 · Dec 3, 2025 · Updated 7 months ago
HYPJUDY / Sparkles
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
☆45 · Jun 14, 2024 · Updated 2 years ago
huchao-AI / APN
Normal Learning in Videos with Attention Prototype Network
☆18 · Jan 19, 2023 · Updated 3 years ago
MMStar-Benchmark / MMStar
[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
☆215 · Sep 26, 2024 · Updated last year
FreedomIntelligence / MLLM-Bench
MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
☆77 · Oct 16, 2024 · Updated last year
SaaRaaS-1300 / InternLM2_horowag
🍏 A repo prepared for the 2024 InternLM (书生·浦语) Large Model Challenge (Spring Season) 🍎 Contains fine-tuning source code related to Horo (赫萝)
☆12 · Sep 20, 2024 · Updated last year
FudanNLPLAB / MouSi
☆75 · Mar 7, 2024 · Updated 2 years ago
kyegomez / PaLM2-VAdapter
Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…
☆17 · Nov 11, 2024 · Updated last year
ZichenWen1 / DIJA
(ICLR 2026 🔥) Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs"
☆79 · Feb 9, 2026 · Updated 5 months ago
baaivision / CapsFusion
[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale
☆215 · Feb 27, 2024 · Updated 2 years ago
mlpc-ucsd / BLIVA
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
☆261 · Apr 14, 2024 · Updated 2 years ago
palchenli / VL-Instruction-Tuning
☆90 · Nov 25, 2023 · Updated 2 years ago
opendatalab / WanJuan3.0
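WanJuan3.0 ("WanJuan Silk Road") is a comprehensive plain-text corpus that collects publicly available web content, literature, patents, and other materials from multiple countries and regions. Its total data size exceeds 1.2 TB and its token count exceeds 300B, placing it at an internationally leading level. The first open-source release consists mainly of five subsets (Thai, Russian, Arabic, Korean, and Vietnamese); the data of each subset…
☆47 · Feb 13, 2025 · Updated last year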
guyyariv / LaMI
[ACL 2026 Oral] Official implementation of LaMI: Augmenting Large Language Models via Late Multi-Image Fusion
☆19 · Jul 4, 2026 · Updated 2 weeks ago
ImKeTT / ReSee
[EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue PyTorch Implementation
☆12 · Dec 4, 2023 · Updated 2 years ago