ByteVisionLab / TokenFlow
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
☆379 · Updated last month
Alternatives and similar repositories for TokenFlow
Users who are interested in TokenFlow are comparing it to the repositories listed below.
- A Unified Tokenizer for Visual Generation and Understanding ☆397 · Updated last month
- PyTorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ☆402 · Updated 2 months ago
- Official implementation of UnifiedReward & UnifiedReward-Think ☆534 · Updated this week
- High-performance Image Tokenizers for VAR and AR ☆288 · Updated 4 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆146 · Updated last month
- Code for MetaMorph: Multimodal Understanding and Generation via Instruction Tuning ☆209 · Updated 5 months ago
- This is a repo to track the latest autoregressive visual generation papers. ☆394 · Updated 2 months ago
- This is a repository for organizing papers, codes, and other resources related to unified multimodal models. ☆288 · Updated last week
- Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning ☆210 · Updated 3 months ago
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing" ☆285 · Updated 4 months ago
- [ICCV 2025] Code release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆162 · Updated 3 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆383 · Updated 4 months ago
- Implements VAR+CLIP for text-to-image (T2I) generation ☆148 · Updated 7 months ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆194 · Updated 2 months ago
- ☆154 · Updated 2 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation ☆206 · Updated last month
- VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning ☆264 · Updated 5 months ago
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat… ☆234 · Updated 4 months ago
- [ICLR 2025] Diffusion Feedback Helps CLIP See Better ☆289 · Updated 7 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆567 · Updated last week
- Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unifie… ☆206 · Updated this week
- Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction" ☆257 · Updated 4 months ago
- This is a repository for organizing papers, codes and other resources related to unified multimodal models. ☆688 · Updated last month
- [ICCV 2025] Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers ☆339 · Updated 2 months ago
- "FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching" FlowAR employs a simplest scale design and is compatible with an… ☆150 · Updated 4 months ago
- EVE Series: Encoder-Free Vision-Language Models from BAAI ☆350 · Updated last month
- [ICCV25] USP: Unified Self-Supervised Pretraining for Image Generation and Understanding ☆89 · Updated 2 months ago
- ☆179 · Updated 9 months ago
- Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing ☆89 · Updated last week
- [ICLR'25] Reconstructive Visual Instruction Tuning ☆116 · Updated 5 months ago