opendatalab/MLLM-DataEngine

MLLM-DataEngine: An Iterative Refinement Approach for MLLM

☆49

Alternatives and similar repositories for MLLM-DataEngine

Users who are interested in MLLM-DataEngine are comparing it to the libraries listed below.

opendatalab / VIGC
AAAI 2024: Visual Instruction Generation and Correction
☆97 · Feb 4, 2024 · Updated 2 years ago
opendatalab / CLIP-Parrot-Bias
ECCV2024_Parrot Captions Teach CLIP to Spot Text
☆66 · Sep 6, 2024 · Updated last year
JiuTian-VL / JiuTian-LION
[CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
☆154 · Sep 3, 2025 · Updated 10 months ago
opendatalab / opendatalab-python-sdk
SDK of OpenDataLab - https://opendatalab.org.cn
☆60 · Jul 31, 2025 · Updated 11 months ago
huangyangyu / FreeEnricher
FreeEnricher: Enriching Face Landmarks without Additional Cost [Official, AAAI 2023]
☆18 · Dec 2, 2024 · Updated last year
Xiaomeng-Yang / STR_benchmark_cleansed
☆14 · May 26, 2023 · Updated 3 years ago
opendatalab / labelbee
☆25 · Nov 7, 2022 · Updated 3 years ago
yule-BUAA / MergeLLM
Codes for Merging Large Language Models
☆37 · Aug 7, 2024 · Updated last year
opendatalab / opendatalab-datasets
datasets resource
☆150 · May 27, 2026 · Updated last month
luogen1996 / LLaVA-HR
[ICLR2025] LLaVA-HR: High-Resolution Large Language-Vision Assistant
☆249 · Aug 14, 2024 · Updated last year
NIneeeeeem / LangDC
[EMNLP 2025 Oral] Official codebase for Seeing More, Saying More: Lightweight Language Experts are Dynamic Video Token Compressors.
☆18 · Sep 7, 2025 · Updated 10 months ago
LivingSkyTechnologies / Dense_Article_Dataset_DAD
Dense Article Dataset (DAD): A Benchmark Dataset for Document Layout Analysis
☆16 · Jan 13, 2022 · Updated 4 years ago
V3Det / mmdetection-V3Det
OpenMMLab Detection Toolbox and Benchmark for V3Det
☆15 · Apr 3, 2024 · Updated 2 years ago
Leomingyangli / SemanticSLAM
☆16 · Oct 5, 2023 · Updated 2 years ago
foundation-multimodal-models / CAL
[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
☆58 · Sep 26, 2024 · Updated last year
JulioZhao97 / EffTrans_Fsdet
This is a repository for ACMMM22 paper "Exploring Effective Knowledge Transfer for Few-shot Object Detection"
☆18 · Jun 21, 2023 · Updated 3 years ago
FreedomIntelligence / ALLaVA
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
☆281 · Jun 25, 2024 · Updated 2 years ago
WarlockWendell / AggDet
official implementation of Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation
☆13 · Apr 15, 2024 · Updated 2 years ago
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆297 · Mar 13, 2024 · Updated 2 years ago
wenhe-jia / TIVE
☆11 · Jan 18, 2024 · Updated 2 years ago
OpenGVLab / all-seeing
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition&Understanding and General Relation Comprehension of …
☆508 · Aug 9, 2024 · Updated last year
kleimerTU / HumanCentricLayouts
☆19 · Jan 1, 2023 · Updated 3 years ago
opendatalab / OHR-Bench
(ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
☆104 · Dec 3, 2025 · Updated 7 months ago
BotPlayers / BotPlayers
Play with agents and more.
☆22 · Sep 18, 2023 · Updated 2 years ago
chen-xin-94 / DART
☆23 · Jul 9, 2026 · Updated last week
Charles-Xie / awesome-described-object-detection
A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring E…
☆358 · Nov 6, 2025 · Updated 8 months ago
hlz0606 / SSPCM
CVPR2023
☆18 · Mar 18, 2023 · Updated 3 years ago
locuslab / scaling_laws_data_filtering
☆64 · Apr 9, 2024 · Updated 2 years ago
shikiw / OPERA
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo…
☆411 · Aug 24, 2024 · Updated last year
hanhung / TGNN
☆26 · Mar 15, 2022 · Updated 4 years ago
ttgeng233 / LongVALE
LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos. (CVPR 2025)
☆61 · Jun 9, 2025 · Updated last year
shanface33 / GPT4MF_UB
Official repository of the paper: Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics
☆15 · Mar 22, 2024 · Updated 2 years ago
panzhang0212 / CoCosNet
☆11 · Jun 20, 2020 · Updated 6 years ago
mshukor / ima-lmms
[NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
☆23 · Oct 15, 2024 · Updated last year
McGill-NLP / AURORA
Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
☆35 · Jun 30, 2025 · Updated last year
wux-labs / OpenXLab-IntelligentSalesAssistant
☆19 · Jun 21, 2024 · Updated 2 years ago
Wang-Xiaodong1899 / Open-R1-Video
✨First Open-Source R1-like Video-LLM [2025/02/18]
☆382 · Jul 1, 2026 · Updated 3 weeks ago
SYuan03 / MM-IFEngine
[ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following
☆126 · Feb 13, 2026 · Updated 5 months ago
andruekonst / egbm-gam
A straightforward implementation of EGBM-based Generalized Additive Model
☆13 · Oct 15, 2020 · Updated 5 years ago