shawn0728/Unify-Agent

🐧 Unify-Agent: An end-to-end unified multimodal agent for faithful, knowledge-grounded image generation.

☆86

Alternatives and similar repositories for Unify-Agent

Users interested in Unify-Agent are comparing it to the repositories listed below.

MeiGen-AI / GenEvolve
Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation
☆77 · May 22, 2026 · Updated 2 months ago
tulerfeng / Gen-Searcher
Gen-Searcher: Reinforcing Agentic Search for Image Generation
☆376 · Apr 7, 2026 · Updated 3 months ago
yczhou001 / LongBench-T2I
Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation
☆23 · Sep 24, 2025 · Updated 10 months ago
langmanbusi / CoCoEdit
[ICML 2026] Official PyTorch implementation of paper “CoCoEdit: Content-Consistent Image Editing via Region Regularized Reinforcement Lea…
☆26 · Jun 14, 2026 · Updated last month
shawn0728 / OpenSearch-VL
🔍 OpenSearch-VL provides a fully open recipe for training strong multimodal deep search agents through high-quality data curation, diver…
☆256 · May 19, 2026 · Updated 2 months ago
shawn0728 / ARES
[ICLR 2026]🌴 ARES is an open-source framework for adaptive multimodal reasoning, featuring a two-stage pipeline: Adaptive Cold-Start and …
☆22 · Feb 3, 2026 · Updated 5 months ago
lcqysl / GEMS
GEMS: Agent-Native Multimodal Generation with Memory and Skills
☆139 · Apr 1, 2026 · Updated 3 months ago
CostaliyA / Flow-OPD
Official Repo of "Flow-OPD: On-Policy Distillation for Flow Matching Models"
☆265 · Jun 24, 2026 · Updated last month
G-U-N / UniRL
[ICML 2026] a unified reinforcement learning toolbox for joint RL on language models and diffusion models
☆91 · May 26, 2026 · Updated last month
HorizonWind2004 / reconstruction-alignment
[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potenti…
☆411 · May 23, 2026 · Updated 2 months ago
Tencent-Hunyuan / SAGE-GRPO
Official Implementation of SAGE-GRPO: Manifold-Aware Exploration for Reinforcement Learning in Video Generation
☆126 · Apr 2, 2026 · Updated 3 months ago
mm-vl / ULM-R1
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
☆48 · Jul 22, 2025 · Updated last year
yczhou001 / PF-OPSD
World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning
☆21 · Jun 3, 2026 · Updated last month
HKUST-C4G / diffusion-rm
The official code of "Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling"
☆66 · Jun 30, 2026 · Updated 3 weeks ago
iGuoYanjun / Memorize-When-Needed
☆23 · Jun 29, 2026 · Updated 3 weeks ago
micky-li-hd / CoCo
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation
☆54 · Apr 9, 2026 · Updated 3 months ago
ljzycmd / SCD
Consistent Human Image and Video Generation with Spatially Conditioned Diffusion
☆16 · Sep 1, 2025 · Updated 10 months ago
vvvvvjdy / dmdr
[ECCV 2026] Official Code of "Distribution Matching Distillation Meets Reinforcement Learning"
☆285 · Feb 1, 2026 · Updated 5 months ago
KaiyueSun98 / T2I-ReasonBench
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
☆37 · Sep 16, 2025 · Updated 10 months ago
Luo-Yihong / TDM-R1
[ICML 2026][Ultra Powerful Few-Step Diffusion RL] TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
☆116 · May 25, 2026 · Updated 2 months ago
ChoS3nE11ven / Agentic-MME
☆36 · Apr 13, 2026 · Updated 3 months ago
PicoTrex / Mind-Brush
Implement search image generation similar to Nano-banana-pro / Seedream / FLUX. [SIGGRAPH Asia 2026]
☆96 · Mar 10, 2026 · Updated 4 months ago
CaraJ7 / T2I-R1
[NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
☆433 · Sep 18, 2025 · Updated 10 months ago
PKU-YuanGroup / Edit-R1
Edit-R1: Reinforce Image Editing with Diffusion Negative-Aware Finetuning and MLLM Implicit Feedback
☆295 · Jan 24, 2026 · Updated 6 months ago
facebookresearch / GenEval2
Evaluation codes and data for GenEval2
☆80 · Jan 8, 2026 · Updated 6 months ago
PKU-YuanGroup / WISE
[ICML 2026🔥] WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
☆212 · Jun 26, 2026 · Updated 3 weeks ago
huangrh99 / AlphaGRPO
[ICML2026] Official Implementation of AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in Unified Multimodal Models via Decompo…
☆73 · Jul 14, 2026 · Updated last week
kxfan2002 / Reagent
Agent-RRM: Exploring Reasoning Reward Model for Agents
☆70 · Mar 17, 2026 · Updated 4 months ago
taco-group / 4KLSDB
[CVPR 2026 DataCV Workshop] 4KLSDB: A Large-Scale Native-4K Dataset and Benchmark for Image Restoration and Generation.
☆23 · May 28, 2026 · Updated last month
zjr2000 / SPES
Official Implementation for paper "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm"
☆23 · May 8, 2026 · Updated 2 months ago
HKU-MMLab / Macro
The official repo of "MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data"
☆67 · Mar 27, 2026 · Updated 3 months ago
alibaba / OmniDoc-TokenBench
☆69 · May 14, 2026 · Updated 2 months ago
Tencent-Hunyuan / UniRL
UniRL is a Framework for Unified Multimodal Model Reinforcement Learning
☆853 · Updated this week
wyhlovecpp / GPT-Image-Edit
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
☆243 · Aug 15, 2025 · Updated 11 months ago
thuml / Reasoning-Visual-World
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…
☆100 · Mar 9, 2026 · Updated 4 months ago
showlab / Adv-GRPO
[CVPR 2026] An official implementation of Adv-GRPO. The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image…
☆88 · Feb 26, 2026 · Updated 4 months ago
PhoenixZ810 / RISEBench
[NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
☆155 · May 18, 2026 · Updated 2 months ago
Osilly / Interleaving-Reasoning-Generation
[ICLR 2026] This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA bench…
☆100 · Jan 26, 2026 · Updated 5 months ago
JaydenLyh / Reward-Forcing
[CVPR 2026 Highlight] Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
☆352 · Dec 15, 2025 · Updated 7 months ago