Osilly/Interleaving-Reasoning-Generation


Osilly / Interleaving-Reasoning-Generation

[ICLR 2026] An early exploration that introduces Interleaving Reasoning to the text-to-image generation field and achieves SoTA benchmark performance. It also significantly improves the quality, fine-grained detail, and aesthetics of generated images.

☆100

Alternatives and similar repositories for Interleaving-Reasoning-Generation

Users who are interested in Interleaving-Reasoning-Generation are comparing it to the repositories listed below.

facebookresearch / GenEval2
Evaluation code and data for GenEval2
☆80 · Jan 8, 2026 · Updated 6 months ago
Fr0zenCrane / UniCoT
[ICLR 2026] Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision
☆233 · May 31, 2026 · Updated last month
Osilly / Awesome-Interleaving-Reasoning
Interleaving Reasoning: Next-Generation Reasoning Systems for AGI
☆280 · Jun 5, 2026 · Updated last month
QC-LY / UiG
Code for "Understanding-in-Generation: Reinforcing Generative Capability of Unified Model via Infusing Understanding into Generation"
☆15 · Nov 11, 2025 · Updated 8 months ago
PKU-YuanGroup / UniSandBox
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward
☆60 · Nov 27, 2025 · Updated 7 months ago
rongyaofang / prism-bench
This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe…
☆131 · Jan 29, 2026 · Updated 5 months ago
wdrink / SimpleAR
PyTorch implementation of the paper "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
☆431 · Jun 20, 2025 · Updated last year
Andrew0613 / PICABench
PICABench: How Far Are We from Physically Realistic Image Editing?
☆39 · Nov 5, 2025 · Updated 8 months ago
HorizonWind2004 / reconstruction-alignment
[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potenti…
☆410 · May 23, 2026 · Updated 2 months ago
zhentao-zou / MURE
Beyond Textual CoT: Interleaved Text-image chains with Deep Confidence Reasoning for Image Editing
☆19 · Jun 24, 2026 · Updated 3 weeks ago
zhengdian1 / AIA
☆45 · Jan 4, 2026 · Updated 6 months ago
FrankYang-17 / RealUnify
☆27 · Oct 10, 2025 · Updated 9 months ago
showlab / UniRL
The code repository of UniRL
☆53 · May 30, 2025 · Updated last year
LeapLabTHU / AdaNAT
[ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
☆37 · Sep 12, 2024 · Updated last year
spatigen / milr
Official code of paper: MILR: Improving Multimodal Image Generation via Test-Time Latent Reasoning
☆18 · Feb 12, 2026 · Updated 5 months ago
arctanxarc / GENIUS
☆42 · May 9, 2026 · Updated 2 months ago
wendell0218 / Janus-Pro-R1
[NeurIPS 2025] Official repository of the paper "Unlocking Aha Moments via Reinforcement Learning: Advancing Collaborative Visual Compreh…
☆23 · Sep 27, 2025 · Updated 9 months ago
PhoenixZ810 / RISEBench
[NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
☆155 · May 18, 2026 · Updated 2 months ago
CUC-MIPG / UnifyEdit
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
☆13 · Dec 29, 2024 · Updated last year
baaivision / Emu3.5
Native Multimodal Models are World Learners
☆1,537 · Dec 30, 2025 · Updated 6 months ago
ZiyuGuo99 / Thinking-while-Generating
The first Interleaved framework for textual reasoning within the visual generation process
☆164 · Mar 16, 2026 · Updated 4 months ago
GAIR-NLP / thinking-with-generated-images
Doodling our way to AGI ✏️ 🖼️ 🧠
☆128 · May 29, 2025 · Updated last year
TencentARC / MindOmni
[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
☆139 · Oct 15, 2025 · Updated 9 months ago
LeapLabTHU / AdaGen
Official code for "AdaGen: Learning Adaptive Policy for Image Synthesis"
☆15 · Mar 18, 2026 · Updated 4 months ago
VisualSphinx / VisualSphinx
☆17 · Jun 3, 2025 · Updated last year
inclusionAI / Ming-UniVision
Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer
☆143 · Oct 14, 2025 · Updated 9 months ago
facebookresearch / tuna-2
Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation
☆739 · Updated this week
CSfufu / Revisual-R1
[ICLR 2026]🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, mul…
☆212 · Dec 10, 2025 · Updated 7 months ago
ZiyuGuo99 / Image-Generation-CoT
[CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation
☆865 · Mar 19, 2026 · Updated 4 months ago
JiuhaiChen / BLIP3o
Official implementation of BLIP3o-Series
☆1,663 · Nov 29, 2025 · Updated 7 months ago
shawn0728 / ARES
[ICLR 2026]🌴 ARES is an open-source framework for adaptive multimodal reasoning, featuring a two-stage pipeline—Adaptive Cold-Start and …
☆22 · Feb 3, 2026 · Updated 5 months ago
thuml / Reasoning-Visual-World
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…
☆100 · Mar 9, 2026 · Updated 4 months ago
PKU-YuanGroup / ImgEdit
[NeurIPS 2025 D&B🔥] ImgEdit: A Unified Image Editing Dataset and Benchmark
☆327 · Nov 5, 2025 · Updated 8 months ago
UCSC-VLAA / Complex-Edit
Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark
☆29 · Apr 22, 2025 · Updated last year
Franklin-Zhang0 / ReasonGen-R1
Official repository for ReasonGen-R1
☆75 · Jun 23, 2025 · Updated last year
cheryyunl / ROVER
Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation
☆26 · Dec 12, 2025 · Updated 7 months ago
CaraJ7 / T2I-R1
[NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
☆433 · Sep 18, 2025 · Updated 10 months ago
PKU-YuanGroup / UniWorld
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
☆883 · Dec 23, 2025 · Updated 7 months ago
ByteDance-Seed / Bagel
Open-source unified multimodal model
☆6,109 · May 4, 2026 · Updated 2 months ago