yinboc / dito
Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers"
☆108 · Updated last month
Alternatives and similar repositories for dito:
Users who are interested in dito are comparing it to the libraries listed below:
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth? ☆111 · Updated last month
- The codebase of our paper "Improving the Training of Rectified Flows", NeurIPS 2024 ☆103 · Updated 5 months ago
- EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling. ☆80 · Updated last month
- TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation ☆49 · Updated last week
- [CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Mo… ☆143 · Updated last week
- "FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching" FlowAR employs the simplest scale design and is compatible with an… ☆95 · Updated 3 months ago
- Official Implementation for Diffusion Models Without Classifier-free Guidance ☆102 · Updated last month
- (CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models ☆101 · Updated this week
- ElasticTok: Adaptive Tokenization for Image and Video ☆61 · Updated 4 months ago
- [NeurIPS 24] Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models ☆36 · Updated 5 months ago
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat… ☆143 · Updated 3 weeks ago
- "SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow", Yuanzhi Zhu, Xingchao Liu, Qiang Liu ☆49 · Updated 4 months ago
- [CVPR 2024] On the Content Bias in Fréchet Video Distance ☆103 · Updated 5 months ago
- Implementation of the paper "MaskBit: Embedding-free Image Generation from Bit Tokens" ☆60 · Updated last month
- Official PyTorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆56 · Updated last month
- HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ☆50 · Updated last month
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆64 · Updated 4 months ago
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆86 · Updated 5 months ago
- Official GitHub repo for the NeurIPS 2024 paper "Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment" ☆47 · Updated this week
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆49 · Updated 8 months ago
- Minimal multi-GPU implementation of EDM2: "Analyzing and Improving the Training Dynamics of Diffusion Models" ☆29 · Updated last year
- Towards training VQ-VAE models robustly! ☆62 · Updated 2 months ago
- The official implementation of "[MASK] is All You Need" ☆115 · Updated 3 weeks ago