[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
⭐464 · Aug 8, 2025 · Updated 10 months ago
Alternatives and similar repositories for TokenFlow
Users that are interested in TokenFlow are comparing it to the libraries listed below.
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding ⭐525 · Nov 14, 2025 · Updated 6 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ⭐422 · Apr 25, 2025 · Updated last year
- SEED-Voken: A Series of Powerful Visual Tokenizers ⭐1,011 · Nov 25, 2025 · Updated 6 months ago
- High-performance Image Tokenizers for VAR and AR ⭐307 · Apr 25, 2025 · Updated last year
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ⭐1,953 · Aug 15, 2024 · Updated last year
- [ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation. ⭐1,941 · Jan 8, 2026 · Updated 5 months ago
- This is a repository for organizing papers, codes and other resources related to unified multimodal models. ⭐825 · Oct 10, 2025 · Updated 7 months ago
- [ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation" ⭐205 · Jan 7, 2026 · Updated 5 months ago
- [TMLR 2025 🔥] A survey for the autoregressive models in vision. ⭐796 · May 5, 2026 · Updated last month
- This repo contains the code for 1D tokenizer and generator ⭐1,155 · Mar 20, 2025 · Updated last year
- Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ⭐429 · Jun 20, 2025 · Updated 11 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ⭐650 · Oct 29, 2025 · Updated 7 months ago
- [CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis ⭐1,570 · Apr 16, 2026 · Updated last month
- PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838 ⭐1,930 · Feb 20, 2026 · Updated 3 months ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ⭐647 · Oct 16, 2025 · Updated 7 months ago
- Official implementation of BLIP3o-Series ⭐1,660 · Nov 29, 2025 · Updated 6 months ago
- [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models ⭐325 · Apr 24, 2025 · Updated last year
- Next-Token Prediction is All You Need ⭐2,417 · Jan 12, 2026 · Updated 4 months ago
- [CVPR 2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project ⭐186 · Mar 20, 2025 · Updated last year
- (Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators ⭐644 · Jun 1, 2026 · Updated last week
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ⭐649 · Oct 16, 2024 · Updated last year
- This is a repo to track the latest autoregressive visual generation papers. ⭐431 · Jun 25, 2025 · Updated 11 months ago
- ⭐195 · Dec 17, 2024 · Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI ⭐369 · Jul 24, 2025 · Updated 10 months ago
- This is the official implementation for ControlVAR. ⭐128 · Dec 10, 2024 · Updated last year
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat… ⭐250 · Oct 12, 2025 · Updated 7 months ago
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ⭐477 · Jan 17, 2025 · Updated last year
- Implements VAR+CLIP for text-to-image (T2I) generation ⭐147 · Jan 23, 2025 · Updated last year
- Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI ⭐1,366 · Jan 27, 2026 · Updated 4 months ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ⭐206 · Jul 14, 2025 · Updated 10 months ago
- Multimodal Models in Real World ⭐558 · Feb 24, 2025 · Updated last year
- [NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization. ⭐323 · Jul 9, 2024 · Updated last year
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ⭐97 · Mar 1, 2025 · Updated last year
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing" ⭐315 · Sep 28, 2025 · Updated 8 months ago
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ⭐193 · Feb 24, 2026 · Updated 3 months ago
- FlexTok: Resampling Images into 1D Token Sequences of Flexible Length ⭐319 · Jun 2, 2025 · Updated last year
- ⭐321 · May 29, 2025 · Updated last year
- [COLM'25] Official implementation of the Law of Vision Representation in MLLMs ⭐177 · Oct 6, 2025 · Updated 8 months ago
- [CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models ⭐1,487 · Dec 16, 2025 · Updated 5 months ago