showlab/DIM

[ICLR 2026] Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing

☆28

Alternatives and similar repositories for DIM

Users who are interested in DIM are comparing it to the repositories listed below.

ExplainableML / LCS
[ICML 2026] The Latent Color Subspace: Emergent Order in High-Dimensional Chaos
☆26 · Jun 9, 2026 · Updated last month
showlab / Show-Anything-3D
Edit and Generate Anything in 3D world!
☆13 · Apr 15, 2023 · Updated 3 years ago
hithqd / ReasonBrain
[ICML 2026] Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning
☆27 · May 18, 2026 · Updated 2 months ago
zhentao-zou / MURE
Beyond Textual CoT: Interleaved Text-image chains with Deep Confidence Reasoning for Image Editing
☆19 · Jun 24, 2026 · Updated 3 weeks ago
wendell0218 / Janus-Pro-R1
[NeurIPS 2025] Official repository of the paper "Unlocking Aha Moments via Reinforcement Learning: Advancing Collaborative Visual Compreh…
☆23 · Sep 27, 2025 · Updated 9 months ago
VidCapBench / VidCapBench
☆13 · May 17, 2025 · Updated last year
VectorSpaceLab / EditScore
[ICLR 2026] EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling
☆254 · Mar 20, 2026 · Updated 4 months ago
showlab / Edit2Perceive
[CVPR 2026] Official Implementation of Edit2Perceive
☆47 · Feb 21, 2026 · Updated 5 months ago
showlab / EVOLVE-VLA
EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models
☆87 · Dec 17, 2025 · Updated 7 months ago
showlab / Adv-GRPO
[CVPR 2026] An official implementation of Adv-GRPO. The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image…
☆88 · Feb 26, 2026 · Updated 4 months ago
LINs-lab / APEX
[Preprint] Self-Adversarial One Step Generation via Condition Shifting
☆55 · Apr 15, 2026 · Updated 3 months ago
showlab / FQGAN
FQGAN: Factorized Visual Tokenization and Generation
☆59 · Mar 29, 2025 · Updated last year
ULMEvalKit / ULMEvalKit
ULMEvalKit: One-Stop Eval ToolKit for Image Generation
☆56 · Dec 17, 2025 · Updated 7 months ago
MZ-MiaoZhang / DRLK
☆12 · Oct 14, 2022 · Updated 3 years ago
hobart07 / Step1X-Edit_train
☆14 · May 20, 2025 · Updated last year
vvvvvjdy / SRA
[ICLR 2026] Self-Representation Alignment for Diffusion Transformers (SRA)
☆144 · Jul 3, 2026 · Updated 2 weeks ago
CSU-JPG / TextAtlas
[ICML 2026] A Large-scale Dataset for Training and Evaluating Models' Ability on Dense Text Image Generation
☆93 · Sep 27, 2025 · Updated 9 months ago
percent4 / llama-2-multiple-choice-mrc
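This project uses the Firefly model training framework to fine-tune the LLAMA-2 model on the multiple-choice machine reading comprehension task (Multiple Choice MRC), achieving significant improvement.
☆11 · Sep 16, 2023 · Updated 2 years ago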
multimodal-reasoning-lab / Bagel-Zebra-CoT
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
☆137 · Jan 30, 2026 · Updated 5 months ago
xypeng9903 / LDF-VFI
[CVPR 2026] Towards Holistic Modeling for Video Frame Interpolation with Auto-regressive Diffusion Transformers
☆46 · May 3, 2026 · Updated 2 months ago
weichow23 / EditMGT
Official Repo for Paper <EditMGT Unleashing the Potential of Masked Generative Transformer in Image Editing>
☆79 · Dec 20, 2025 · Updated 7 months ago
xiechenxi99 / DNAEdit_code
[NeurIPS 2025 Spotlight] Official implementation for DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing
☆32 · Jan 23, 2026 · Updated 5 months ago
showlab / DoraCycle
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
☆31 · Mar 8, 2026 · Updated 4 months ago
zhuangshaobin / WeTok
[ICLR 2026] WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction
☆69 · Sep 3, 2025 · Updated 10 months ago
NJU-PCALab / InstanceCap
[CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption 🔍
☆45 · Jul 5, 2025 · Updated last year
showlab / DiffSim
[ICCV 2025] Official repository of DiffSim: Taming Diffusion Models for Evaluating Visual Similarity
☆31 · Jul 14, 2025 · Updated last year
FingerRec / OA-Transformer
[CVPR 2022] The code for our paper "Object-aware Video-language Pre-training for Retrieval"
☆61 · May 25, 2022 · Updated 4 years ago
google-deepmind / objaverse_annotations
☆15 · Dec 16, 2023 · Updated 2 years ago
showlab / MakeAnything
Official code of "MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation"
☆211 · Apr 1, 2025 · Updated last year
showlab / Impossible-Videos
[ICML 2025] Impossible Videos
☆81 · Jul 23, 2025 · Updated 11 months ago
showlab / cosmo
☆75 · May 10, 2024 · Updated 2 years ago
fal-ai-community / alphabet-dataset
Synthetic Alphabet Dataset
☆19 · Mar 27, 2025 · Updated last year
HKUST-LongGroup / Coarse-guided-Gen
[arXiv 2026] Official PyTorch Repository for "Coarse-Guided Visual Generation via Weighted h-Transform Sampling"
☆42 · May 8, 2026 · Updated 2 months ago
TingtingLiao / ARCH
☆16 · Apr 29, 2022 · Updated 4 years ago
showlab / VisInContext
Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
☆28 · Oct 30, 2024 · Updated last year
xiangyu-mm / UniFashion
The official code for paper "UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation"
☆37 · Jul 29, 2024 · Updated last year
Guohanzhong / OSA-LCM
☆25 · Dec 19, 2024 · Updated last year
showlab / WorldGUI
Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.
☆124 · Jul 27, 2025 · Updated 11 months ago
Pepper-lll / LMforImageGeneration
Codebase for the paper "Elucidating the Design Space of Language Models for Image Generation"
☆45 · Nov 17, 2024 · Updated last year