erwold/qwen2vl-flux

erwold / qwen2vl-flux

☆571

Alternatives and similar repositories for qwen2vl-flux

Users who are interested in qwen2vl-flux are comparing it to the repositories listed below.

Yuanshi9815 / OminiControl
[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer
☆1,925 · Jul 2, 2026 · Updated 2 weeks ago
instantX-research / Regional-Prompting-FLUX
Training-free Regional Prompting for Diffusion Transformers 🔥
☆696 · Nov 28, 2024 · Updated last year
lehduong / OneDiffusion
Official implementation of OneDiffusion paper (CVPR 2025)
☆662 · Dec 14, 2024 · Updated last year
alimama-creative / FLUX-Controlnet-Inpainting
☆793 · Nov 22, 2024 · Updated last year
ali-vilab / In-Context-LoRA
Official repository of In-Context LoRA for Diffusion Transformers
☆2,078 · Dec 20, 2024 · Updated last year
VectorSpaceLab / OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
☆4,330 · Dec 4, 2025 · Updated 7 months ago
bytedance / UNO
[ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
☆1,359 · Sep 12, 2025 · Updated 10 months ago
XLabs-AI / x-flux
☆2,232 · Nov 8, 2024 · Updated last year
LituRout / RF-Inversion
Rectified Flow Inversion (RF-Inversion) - ICLR 2025
☆477 · Mar 19, 2025 · Updated last year
ToTheBeginning / PuLID
[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
☆3,546 · Jul 31, 2025 · Updated 11 months ago
TencentQQGYLab / ELLA
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
☆1,285 · Jul 17, 2024 · Updated 2 years ago
PKU-YuanGroup / UniWorld
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
☆883 · Dec 23, 2025 · Updated 6 months ago
FireRedTeam / StoryMaker
StoryMaker: Towards consistent characters in text-to-image generation
☆718 · Dec 2, 2024 · Updated last year
Alpha-VLLM / Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
☆2,247 · Feb 16, 2025 · Updated last year
logtd / ComfyUI-Fluxtapoz
Nodes for image juxtaposition for Flux in ComfyUI
☆1,396 · Jan 9, 2025 · Updated last year
instantX-research / CSGO
CSGO: Content-Style Composition in Text-to-Image Generation 🔥
☆391 · Sep 5, 2024 · Updated last year
chenllliang / DreamEngine
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think!
☆123 · Mar 4, 2025 · Updated last year
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
☆796 · Jun 18, 2026 · Updated last month
FoundationVision / Infinity
[CVPR 2025 Oral]Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
☆1,579 · Apr 16, 2026 · Updated 3 months ago
Huage001 / CLEAR
[NeurIPS 2025] Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up".
☆219 · Sep 27, 2025 · Updated 9 months ago
bytedance / SuperEdit
[ICCV 2025] Code & Data for: SuperEdit - Rectifying and Facilitating Supervision for Instruction-Based Image Editing
☆165 · Jun 26, 2025 · Updated last year
baaivision / Emu3
Next-Token Prediction is All You Need
☆2,432 · Jan 12, 2026 · Updated 6 months ago
junjiehe96 / UniPortrait
[ICCV2025] UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization
☆275 · May 1, 2025 · Updated last year
fenfenfenfan / VMix
Official code for VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control
☆191 · Dec 31, 2024 · Updated last year
PKU-YuanGroup / ConsisID
[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
☆849 · Apr 14, 2026 · Updated 3 months ago
zai-org / VisionReward
[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
☆420 · Mar 26, 2025 · Updated last year
Tencent-Hunyuan / HunyuanDiT
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
☆4,292 · Nov 27, 2025 · Updated 7 months ago
stepfun-ai / Step1X-Edit
A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gem…
☆2,235 · Apr 29, 2026 · Updated 2 months ago
yifan123 / flow_grpo
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆2,420 · May 7, 2026 · Updated 2 months ago
ali-vilab / ACE_plus
☆1,367 · Apr 21, 2025 · Updated last year
River-Zhang / ICEdit
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ …
☆2,100 · Dec 19, 2025 · Updated 7 months ago
Yuanshi9815 / Subjects200K
Subjects200K dataset
☆132 · Jan 17, 2025 · Updated last year
JiuhaiChen / BLIP3o
Official implementation of BLIP3o-Series
☆1,663 · Nov 29, 2025 · Updated 7 months ago
fallenshock / FlowEdit
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
☆1,008 · May 27, 2026 · Updated last month
PixArt-alpha / PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
☆3,298 · Oct 31, 2024 · Updated last year
Phantom-video / Phantom
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
☆1,509 · Sep 11, 2025 · Updated 10 months ago
zai-org / CogView4
CogView4, CogView3-Plus and CogView3(ECCV 2024)
☆1,101 · Mar 29, 2025 · Updated last year
ZiyuGuo99 / Image-Generation-CoT
[CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation
☆865 · Mar 19, 2026 · Updated 4 months ago
FireRedTeam / LayerDiffuse-Flux
☆246 · May 9, 2025 · Updated last year