PKU-YuanGroup / UniWorld-V1

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

☆716

Alternatives and similar repositories for UniWorld-V1

Users who are interested in UniWorld-V1 are comparing it to the repositories listed below.

modelscope / Nexus-Gen
☆275 Updated 2 months ago
AIDC-AI / Ovis-U1
A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerfu…
☆426 Updated 2 months ago
Xilluill / KV-Edit
[ICCV 2025] Official implementation for KV-Edit: Training-Free Image Editing for Precise Background Preservation
☆342 Updated 4 months ago
VARGPT-family / VARGPT-v1.1
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
☆267 Updated 5 months ago
lzyhha / VisualCloze
[ICCV 2025] VisualCloze: A universal image generation framework that can support a wide range of in-domain tasks and generalize to unseen…
☆257 Updated 3 weeks ago
DCDmllm / AnyEdit
【CVPR 2025 Oral】Official Repo for Paper "AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea"
☆190 Updated 6 months ago
AFeng-x / PixWizard
[ICLR2025] A versatile image-to-image visual assistant, designed for image generation, manipulation, and translation based on free-form u…
☆209 Updated 5 months ago
MS-Diffusion / MS-Diffusion
[ICLR 2025] Official implementation of MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
☆297 Updated 2 months ago
baaivision / NOVA
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
☆576 Updated last month
PKU-YuanGroup / ImgEdit
[NeurIPS 2025 D&B🔥] ImgEdit: A Unified Image Editing Dataset and Benchmark
☆192 Updated last month
knightyxp / VideoGrain
[ICLR 2025] VideoGrain: This repo is the official implementation of "VideoGrain: Modulating Space-Time Attention for Multi-Grained Video …
☆154 Updated 6 months ago
YangLing0818 / IterComp
[ICLR 2025] IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
☆199 Updated 7 months ago
FoundationVision / Waver
A video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.
☆617 Updated last month
wdrink / SimpleAR
PyTorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
☆409 Updated 3 months ago
fenghora / personalize-anything
Personalize Anything for Free with Diffusion Transformer
☆349 Updated 6 months ago
Ji4chenLi / t2v-turbo
Code repository for T2V-Turbo and T2V-Turbo-v2
☆302 Updated 8 months ago
HaozheZhao / UltraEdit
☆266 Updated last year
HuiZhang0812 / CreatiLayout
[ICCV 2025] CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation
☆114 Updated 2 months ago
illume-unified-mllm / ILLUME_plus
☆119 Updated last month
viiika / Meissonic
[ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image…
☆330 Updated last week
LPengYang / MotionClone
[ICLR 2025] Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation
☆504 Updated 3 months ago
FoundationVision / FlashVideo
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation
☆446 Updated 7 months ago
VideoVerses / VideoTuna
Let's finetune video generation models!
☆510 Updated 3 weeks ago
TencentARC / DiTCtrl
[CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Long…
☆303 Updated 6 months ago
RockeyCoss / SPO
[CVPR 2025] Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization
☆252 Updated 6 months ago
liuff19 / Video-T1
[ICCV 2025] Video-T1: Test-Time Scaling for Video Generation
☆294 Updated 3 months ago
Alpha-VLLM / Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini…
☆624 Updated 6 months ago
zai-org / VisionReward
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
☆322 Updated 6 months ago
MizzenAI / HPSv3
Official implementation of HPSv3: Towards Wide-Spectrum Human Preference Score (ICCV2025)
☆187 Updated last month
AILab-CVC / SEED-X
Multimodal Models in Real World
☆544 Updated 7 months ago