zhenyuw16/GenArtist

zhenyuw16 / GenArtist

Code release for our NeurIPS 2024 Spotlight paper "GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing"

☆169

Alternatives and similar repositories for GenArtist

Users interested in GenArtist are comparing it to the repositories listed below.

SHI-Labs / T2I-Copilot
T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation (ICCV'25)
☆57 · Oct 6, 2025 · Updated 9 months ago
cuixing100876 / InstaStyle
☆15 · Jul 24, 2024 · Updated 2 years ago
donahowe / TheaterGen
TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
☆69 · Sep 26, 2024 · Updated last year
Eureka-Maggie / MIGE
Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing
☆72 · Jul 13, 2025 · Updated last year
AILab-CVC / SEED-X
Multimodal Models in Real World
☆558 · Feb 24, 2025 · Updated last year
LingjieKong-fdu / CustAny
Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR 2025 (Oral)
☆47 · Apr 10, 2025 · Updated last year
rongyaofang / GoT
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆317 · Sep 28, 2025 · Updated 10 months ago
jylei16 / Imagine-e
☆14 · Jan 22, 2025 · Updated last year
PKU-YuanGroup / WISE
[ICML 2026🔥] WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
☆212 · Jun 26, 2026 · Updated last month
AFeng-x / PixWizard
[ICLR2025] A versatile image-to-image visual assistant, designed for image generation, manipulation, and translation based on free-form u…
☆211 · May 5, 2025 · Updated last year
Monalissaa / DisenDiff
[CVPR 2024, Oral] Attention Calibration for Disentangled Text-to-Image Personalization
☆111 · Apr 10, 2024 · Updated 2 years ago
Karine-Huang / GenMAC
[AAAI 2026] GenMAC for Compositional Text-to-Video Generation
☆35 · Jan 10, 2026 · Updated 6 months ago
wyhlovecpp / GPT-Image-Edit
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
☆243 · Aug 15, 2025 · Updated 11 months ago
TencentARC / SmartEdit
Official code of SmartEdit [CVPR-2024 Highlight]
☆374 · Jun 21, 2024 · Updated 2 years ago
jacklishufan / Reflect-DiT
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
☆56 · Aug 16, 2025 · Updated 11 months ago
YangLing0818 / RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
☆1,840 · Feb 1, 2025 · Updated last year
PKU-YuanGroup / ImgEdit
[NeurIPS 2025 D&B🔥] ImgEdit: A Unified Image Editing Dataset and Benchmark
☆330 · Nov 5, 2025 · Updated 8 months ago
gogoduan / GoT-R1
[ICLR26] GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning
☆106 · Jan 27, 2026 · Updated 6 months ago
aim-uofa / MovieDreamer
[ICLR'25] MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences
☆323 · Aug 10, 2024 · Updated last year
Cominclip / BoxDiff-XL
Extend BoxDiff to SDXL (SDXL-based layout-to-image generation)
☆28 · May 23, 2024 · Updated 2 years ago
ZiyuGuo99 / Image-Generation-CoT
[CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation
☆865 · Mar 19, 2026 · Updated 4 months ago
showlab / Show-o
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
☆1,964 · Jan 8, 2026 · Updated 6 months ago
rongyaofang / prism-bench
This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe…
☆131 · Jan 29, 2026 · Updated 6 months ago
YangLing0818 / EditWorld
[ACM Multimedia 2025 Datasets Track] EditWorld: Simulating World Dynamics for Instruction-Following Image Editing
☆142 · Aug 2, 2025 · Updated 11 months ago
yuvalkirstain / PickScore
☆601 · Dec 21, 2024 · Updated last year
peterljq / Concept-Lancet
The dataset CoLan-150K and the concept decomposition in the paper Concept Lancet (CVPR 2025)
☆20 · Jan 18, 2026 · Updated 6 months ago
omer11a / bounded-attention
☆96 · Sep 22, 2024 · Updated last year
bytedance / MoMA
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
☆234 · Jul 11, 2024 · Updated 2 years ago
Franklin-Zhang0 / ReasonGen-R1
Official repository for ReasonGen-R1
☆75 · Jun 23, 2025 · Updated last year
HITsz-TMG / Agentic-CIGEval
Code of our paper "A Unified Agentic Framework for Evaluating Conditional Image Generation".
☆31 · Jul 22, 2025 · Updated last year
merlresearch / TI2V-Zero
Text-conditioned image-to-video generation based on diffusion models.
☆55 · Jun 13, 2024 · Updated 2 years ago
yifan123 / flow_grpo
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆2,437 · May 7, 2026 · Updated 2 months ago
YangLing0818 / IterComp
[ICLR 2025] IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
☆203 · Feb 19, 2025 · Updated last year
ChenyuHeidiZhang / VL-commonsense
☆14 · May 23, 2022 · Updated 4 years ago
stepfun-ai / Step1X-Edit
A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gem…
☆2,240 · Apr 29, 2026 · Updated 3 months ago
agentic-learning-ai-lab / procreate-diffusion
Public code release for the paper "ProCreate, Don’t Reproduce! Propulsive Energy Diffusion for Creative Generation"
☆43 · Jun 7, 2026 · Updated last month
louisYen / Gen4Gen
🏞️ Official implementation of "Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition"
☆110 · Mar 27, 2026 · Updated 4 months ago
AhmedImtiazPrio / magnet-polarity
Official repository for Polarity Sampling, CVPR 2022 ORAL
☆13 · Jul 25, 2022 · Updated 4 years ago
OSU-NLP-Group / MagicBrush
[NeurIPS'23] "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".
☆411 · Feb 20, 2025 · Updated last year