AFeng-x/PixWizard

[ICLR 2025] A versatile image-to-image visual assistant, designed for image generation, manipulation, and translation based on free-form user instructions.

☆210

Alternatives and similar repositories for PixWizard

Users who are interested in PixWizard are comparing it to the repositories listed below.

TIGER-AI-Lab / OmniEdit
Official Repo for Paper "OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision" [ICLR2025]
☆144 · Jan 27, 2025 · Updated last year
UCSC-VLAA / Complex-Edit
Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark
☆29 · Apr 22, 2025 · Updated last year
fenfenfenfan / VMix
Official code for VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control
☆191 · Dec 31, 2024 · Updated last year
Alpha-VLLM / Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini…
☆646 · Oct 16, 2025 · Updated 9 months ago
fenghora / personalize-anything
[AAAI 2026] Personalize Anything for Free with Diffusion Transformer
☆361 · Mar 26, 2026 · Updated 3 months ago
hustvl / ControlAR
[ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models
☆326 · Jun 30, 2026 · Updated 3 weeks ago
siso-paper / SISO
Official implementation of "Single Image Iterative Subject-driven Generation and Editing".
☆99 · May 30, 2025 · Updated last year
lehduong / OneDiffusion
Official implementation of OneDiffusion paper (CVPR 2025)
☆662 · Dec 14, 2024 · Updated last year
FoundationVision / LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
☆1,959 · Aug 15, 2024 · Updated last year
showlab / MakeAnything
Official code of "MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation"
☆211 · Apr 1, 2025 · Updated last year
wyhlovecpp / GPT-Image-Edit
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
☆243 · Aug 15, 2025 · Updated 11 months ago
mycfhs / DreamMix
The official implementation of paper: DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting
☆121 · Jan 2, 2025 · Updated last year
baaivision / Emu3
Next-Token Prediction is All You Need
☆2,432 · Jan 12, 2026 · Updated 6 months ago
AILab-CVC / SEED-X
Multimodal Models in Real World
☆558 · Feb 24, 2025 · Updated last year
Yuanshi9815 / OminiControl
[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer
☆1,925 · Jul 2, 2026 · Updated 2 weeks ago
sayakpaul / flux-image-editing
Scripts to teach Flux the task of image editing from language with the Flux Control framework.
☆102 · Jun 30, 2025 · Updated last year
OpenGVLab / Diffree
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
☆239 · May 5, 2025 · Updated last year
PKU-YuanGroup / UniWorld
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
☆883 · Dec 23, 2025 · Updated 6 months ago
Alpha-VLLM / Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
☆2,247 · Feb 16, 2025 · Updated last year
MiracleDance / CAR
CAR: Controllable AutoRegressive Modeling for Visual Generation
☆129 · Nov 29, 2024 · Updated last year
TencentARC / SEED-Voken
SEED-Voken: A Series of Powerful Visual Tokenizers
☆1,016 · Nov 25, 2025 · Updated 7 months ago
HaozheZhao / UltraEdit
☆272 · Jul 23, 2024 · Updated last year
TempleX98 / EasyRef
[ICML 2025] EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM
☆73 · Jul 16, 2025 · Updated last year
alipay / style-tokenizer
☆112 · Jul 9, 2024 · Updated 2 years ago
bytedance / SuperEdit
[ICCV 2025] Code & Data for: SuperEdit - Rectifying and Facilitating Supervision for Instruction-Based Image Editing
☆165 · Jun 26, 2025 · Updated last year
stepfun-ai / Step1X-Edit
A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gem…
☆2,236 · Apr 29, 2026 · Updated 2 months ago
wtybest / FreeFlux
[ICCV 2025] FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing
☆77 · Mar 7, 2026 · Updated 4 months ago
VIPL-GENUN / JoPano
JoPano: Unified Panorama Generation via Joint Modeling
☆24 · Mar 6, 2026 · Updated 4 months ago
xyfJASON / ctrlora
[ICLR 2025] Codebase for "CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation"
☆268 · Mar 6, 2026 · Updated 4 months ago
ai-med / StablePose
Official Pytorch Implementation of Paper - Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation - NeurIPS 2024
☆111 · Dec 23, 2024 · Updated last year
showlab / Show-o
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
☆1,963 · Jan 8, 2026 · Updated 6 months ago
rongyaofang / PUMA
Empowering Unified MLLM with Multi-granular Visual Generation
☆132 · Jan 16, 2025 · Updated last year
wangjiangshan0725 / RF-Solver-Edit
[🚀ICML 2025] "Taming Rectified Flow for Inversion and Editing" Using FLUX and HunyuanVideo for image and video editing!
☆637 · May 1, 2025 · Updated last year
VectorSpaceLab / OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
☆4,331 · Dec 4, 2025 · Updated 7 months ago
ali-vilab / FreeScale
[ICCV 2025] Code for FreeScale, a tuning-free method for higher-resolution visual generation
☆148 · Oct 9, 2025 · Updated 9 months ago
yeates / OmniPaint
[ICCV 25] OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting
☆328 · Mar 27, 2026 · Updated 3 months ago
rongyaofang / GoT
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆317 · Sep 28, 2025 · Updated 9 months ago
ZiyuGuo99 / Image-Generation-CoT
[CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation
☆865 · Mar 19, 2026 · Updated 4 months ago
FoundationVision / Infinity
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
☆1,579 · Apr 16, 2026 · Updated 3 months ago