ByteFlow-AI / TokenFlow
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
☆273 Updated this week
Alternatives and similar repositories for TokenFlow:
Users who are interested in TokenFlow are comparing it to the repositories listed below
- Implements VAR+CLIP for text-to-image (T2I) generation ☆123 Updated last month
- This is a repo to track the latest autoregressive visual generation papers. ☆150 Updated this week
- [ICLR25] High-performance Image Tokenizers for VAR and AR ☆206 Updated 2 weeks ago
- ☆135 Updated last month
- Empowering Unified MLLM with Multi-granular Visual Generation ☆117 Updated last month
- Liquid: Language Models are Scalable and Unified Multi-modal Generators ☆67 Updated this week
- The paper collections for the autoregressive models in vision. ☆419 Updated this week
- 📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models. ☆387 Updated last month
- [ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation ☆247 Updated this week
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆394 Updated this week
- [ICLR2025] ☆137 Updated last month
- This is the official implementation for ControlVAR. ☆95 Updated 2 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆232 Updated last month
- [NeurIPS 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image Generation Evaluation ☆236 Updated last month
- A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models! ☆125 Updated last year
- Official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024] ☆73 Updated 2 weeks ago
- ☆48 Updated last week
- [ICLR2024] The official implementation of paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxi… ☆227 Updated 9 months ago
- [CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models ☆159 Updated 5 months ago
- [NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models ☆266 Updated 2 months ago
- Official implementation of the Law of Vision Representation in MLLMs ☆150 Updated 3 months ago
- T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation ☆65 Updated this week
- EVE Series: Encoder-Free Vision-Language Models from BAAI ☆300 Updated this week
- The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation ☆87 Updated 4 months ago
- Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models ☆66 Updated 9 months ago
- The collection of awesome papers on alignment of diffusion models. ☆119 Updated 2 weeks ago
- Official repository for VisionZip (CVPR 2025) ☆240 Updated this week