CaraJ7/T2I-R1

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/CaraJ7/T2I-R1)

CaraJ7 / T2I-R1

[NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

☆433

Alternatives and similar repositories for T2I-R1

Users that are interested in T2I-R1 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ZiyuGuo99 / Image-Generation-CoT
View on GitHub
[CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation
☆865Mar 19, 2026Updated 4 months ago
wdrink / SimpleAR
View on GitHub
Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
☆431Jun 20, 2025Updated last year
rongyaofang / GoT
View on GitHub
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆317Sep 28, 2025Updated 9 months ago
CaraJ7 / DraCo
View on GitHub
Offical Repository for Paper: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
☆17Dec 7, 2025Updated 7 months ago
JiuhaiChen / BLIP3o
View on GitHub
Official implementation of BLIP3o-Series
☆1,663Nov 29, 2025Updated 7 months ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
CodeGoat24 / UnifiedReward
View on GitHub
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
☆796Jun 18, 2026Updated last month
ULMEvalKit / ULMEvalKit
View on GitHub
ULMEvalKit: One-Stop Eval ToolKit for Image Generation
☆56Dec 17, 2025Updated 7 months ago
Diffusion-CoT / ReflectionFlow
View on GitHub
[ICCV 2025] Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
☆220Nov 5, 2025Updated 8 months ago
yifan123 / flow_grpo
View on GitHub
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆2,420May 7, 2026Updated 2 months ago
FoundationVision / UniTok
View on GitHub
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
☆529Nov 14, 2025Updated 8 months ago
showlab / Show-o
View on GitHub
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
☆1,963Jan 8, 2026Updated 6 months ago
PKU-YuanGroup / WISE
View on GitHub
[ICML 2026🔥] WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
☆212Jun 26, 2026Updated 3 weeks ago
csuhan / Tar
View on GitHub
[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
☆202Sep 18, 2025Updated 10 months ago
stepfun-ai / NextStep-1
View on GitHub
[🚀 ICLR 2026 Oral] NextStep-1: SOTA Autogressive Image Generation with Continuous Tokens. A research project developed by the StepFun’s …
☆689Feb 27, 2026Updated 4 months ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
PKU-YuanGroup / UniWorld
View on GitHub
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
☆883Dec 23, 2025Updated 6 months ago
SilentView / GigaTok
View on GitHub
[ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"
☆204Jan 7, 2026Updated 6 months ago
VARGPT-family / VARGPT-v1.1
View on GitHub
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
☆271Apr 15, 2025Updated last year
FoundationVision / Liquid
View on GitHub
(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators
☆642Jun 1, 2026Updated last month
MME-Benchmarks / MME-CoT
View on GitHub
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
☆136Aug 5, 2025Updated 11 months ago
ZiyuGuo99 / Thinking-while-Generating
View on GitHub
The first Interleaved framework for textual reasoning within the visual generation process
☆164Mar 16, 2026Updated 4 months ago
bytedance / 1d-tokenizer
View on GitHub
This repo contains the code for 1D tokenizer and generator
☆1,165Mar 20, 2025Updated last year
FoundationVision / Infinity
View on GitHub
[CVPR 2025 Oral]Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
☆1,579Apr 16, 2026Updated 3 months ago
PKU-YuanGroup / ImgEdit
View on GitHub
[NeurIPS 2025 D&B🔥] ImgEdit: A Unified Image Editing Dataset and Benchmark
☆326Nov 5, 2025Updated 8 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
XueZeyue / DanceGRPO
View on GitHub
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
☆1,635Oct 16, 2025Updated 9 months ago
ByteDance-Seed / Bagel
View on GitHub
Open-source unified multimodal model
☆6,103May 4, 2026Updated 2 months ago
FoundationVision / LlamaGen
View on GitHub
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
☆1,959Aug 15, 2024Updated last year
ShoufaChen / PixelFlow
View on GitHub
Pixel-Space Generative Models
☆316May 11, 2025Updated last year
zai-org / VisionReward
View on GitHub
[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
☆421Mar 26, 2025Updated last year
Gen-Verse / MMaDA
View on GitHub
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with block diffusion, mixed-CoT, unified RL)
☆1,660Feb 14, 2026Updated 5 months ago
Franklin-Zhang0 / ReasonGen-R1
View on GitHub
Official respository for ReasonGen-R1
☆75Jun 23, 2025Updated last year
CaraJ7 / CoMat
View on GitHub
[NeurIPS 2024] 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
☆169Nov 18, 2024Updated last year
MizzenAI / HPSv3
View on GitHub
Official implementation of HPSv3: Towards Wide-Spectrum Human Preference Score (ICCV2025)
☆325Dec 5, 2025Updated 7 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
shawn0728 / Unify-Agent
View on GitHub
🐧 Unify-Agent: An end-to-end unified multimodal agent for faithful, knowledge-grounded image generation.
☆86May 2, 2026Updated 2 months ago
facebookresearch / GenEval2
View on GitHub
Evaluation codes and data for GenEval2
☆80Jan 8, 2026Updated 6 months ago
NVlabs / DiffusionNFT
View on GitHub
[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
☆974Feb 10, 2026Updated 5 months ago
ATH-MaaS / Ovis-U1
View on GitHub
An unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerfu…
☆450Dec 2, 2025Updated 7 months ago
HorizonWind2004 / reconstruction-alignment
View on GitHub
[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potenti…
☆410May 23, 2026Updated last month
PrimitiveAnything / PrimitiveAnything
View on GitHub
[SIGGRAPH 2025] PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer
☆389Apr 17, 2026Updated 3 months ago
shiyi-zh0408 / FlexiAct
View on GitHub
[SIGGRAPH 2025] Official code of the paper "FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios"
☆341Oct 30, 2025Updated 8 months ago