[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
★465 · Aug 8, 2025 · Updated 10 months ago
Alternatives and similar repositories for TokenFlow
Users that are interested in TokenFlow are comparing it to the libraries listed below.
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding ★527 · Nov 14, 2025 · Updated 7 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ★425 · Apr 25, 2025 · Updated last year
- SEED-Voken: A Series of Powerful Visual Tokenizers ★1,012 · Nov 25, 2025 · Updated 7 months ago
- High-performance Image Tokenizers for VAR and AR ★307 · Apr 25, 2025 · Updated last year
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ★1,957 · Aug 15, 2024 · Updated last year
- [ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation. ★1,956 · Jan 8, 2026 · Updated 5 months ago
- This is a repository for organizing papers, codes and other resources related to unified multimodal models. ★828 · Oct 10, 2025 · Updated 8 months ago
- [ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation" ★204 · Jan 7, 2026 · Updated 5 months ago
- [TMLR 2025🔥] A survey for the autoregressive models in vision. ★801 · May 5, 2026 · Updated last month
- This repo contains the code for 1D tokenizer and generator ★1,162 · Mar 20, 2025 · Updated last year
- Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ★429 · Jun 20, 2025 · Updated last year
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ★652 · Oct 29, 2025 · Updated 8 months ago
- [CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis ★1,576 · Apr 16, 2026 · Updated 2 months ago
- PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838 ★1,939 · Feb 20, 2026 · Updated 4 months ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini…" ★647 · Oct 16, 2025 · Updated 8 months ago
- Official implementation of BLIP3o-Series ★1,658 · Nov 29, 2025 · Updated 6 months ago
- [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models ★326 · Apr 24, 2025 · Updated last year
- Next-Token Prediction is All You Need ★2,423 · Jan 12, 2026 · Updated 5 months ago
- [CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project ★186 · Mar 20, 2025 · Updated last year
- (Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators ★643 · Jun 1, 2026 · Updated 3 weeks ago
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ★648 · Oct 16, 2024 · Updated last year
- This is a repo to track the latest autoregressive visual generation papers. ★430 · Jun 25, 2025 · Updated last year
- ★196 · Dec 17, 2024 · Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI ★372 · Jul 24, 2025 · Updated 11 months ago
- This is the official implementation for ControlVAR. ★129 · Dec 10, 2024 · Updated last year
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat…" ★251 · Oct 12, 2025 · Updated 8 months ago
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ★477 · Jan 17, 2025 · Updated last year
- Implements VAR+CLIP for text-to-image (T2I) generation ★147 · Jan 23, 2025 · Updated last year
- Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI ★1,375 · Jan 27, 2026 · Updated 5 months ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ★206 · Jul 14, 2025 · Updated 11 months ago
- Multimodal Models in Real World ★557 · Feb 24, 2025 · Updated last year
- [NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization. ★324 · Jul 9, 2024 · Updated last year
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ★97 · Mar 1, 2025 · Updated last year
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing" ★315 · Sep 28, 2025 · Updated 9 months ago
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ★194 · Feb 24, 2026 · Updated 4 months ago
- FlexTok: Resampling Images into 1D Token Sequences of Flexible Length ★321 · Jun 2, 2025 · Updated last year
- ★322 · May 29, 2025 · Updated last year
- [COLM'25] Official implementation of the Law of Vision Representation in MLLMs ★177 · Oct 6, 2025 · Updated 8 months ago
- [CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models ★1,498 · Dec 16, 2025 · Updated 6 months ago