FoundationVision / UniTok
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
★442 · Updated this week
Alternatives and similar repositories for UniTok
Users that are interested in UniTok are comparing it to the libraries listed below
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ★398 · Updated 3 months ago
- Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ★413 · Updated 4 months ago
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing" ★294 · Updated last month
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ★593 · Updated 2 weeks ago
- VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning ★268 · Updated 7 months ago
- Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning ★228 · Updated 5 months ago
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ★220 · Updated 6 months ago
- Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think ★605 · Updated this week
- High-performance Image Tokenizers for VAR and AR ★296 · Updated 6 months ago
- ★162 · Updated 4 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ★406 · Updated 6 months ago
- Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction" ★279 · Updated 6 months ago
- [ICCV 2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ★177 · Updated 5 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ★160 · Updated last week
- [ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation" ★192 · Updated 4 months ago
- [ICCV 2025] Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers ★391 · Updated 3 weeks ago
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat… ★237 · Updated last month
- This is a repo to track the latest autoregressive visual generation papers. ★409 · Updated 4 months ago
- Structured Video Comprehension of Real-World Shorts ★215 · Updated last month
- Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unifie… ★302 · Updated 3 weeks ago
- Implements VAR+CLIP for text-to-image (T2I) generation