Open reproduction of MUSE for fast text2image generation.
☆359 · Jun 1, 2024 · Updated last year
Alternatives and similar repositories for open-muse
Users who are interested in open-muse are comparing it to the libraries listed below.
- An in-context conditioning version of MUSE with pre-trained checkpoints. ☆116 · Jun 4, 2023 · Updated 2 years ago
- Official Jax Implementation of MaskGIT ☆554 · Nov 18, 2022 · Updated 3 years ago
- ☆88 · Jan 4, 2024 · Updated 2 years ago
- Code for instruction-tuning Stable Diffusion. ☆249 · Feb 16, 2024 · Updated 2 years ago
- Fast and controllable text-to-image model. ☆41 · Jun 16, 2023 · Updated 2 years ago
- MoVQGAN - model for the image encoding and reconstruction ☆260 · Oct 31, 2023 · Updated 2 years ago
- Official implementation of Würstchen: Efficient Pretraining of Text-to-Image Models ☆555 · Apr 6, 2024 · Updated last year
- Emu Series: Generative Multimodal Models from BAAI ☆1,765 · Jan 12, 2026 · Updated last month
- Official JAX implementation of MAGVIT: Masked Generative Video Transformer ☆995 · Jan 17, 2024 · Updated 2 years ago
- Official implementation of SEED-LLaMA (ICLR 2024). ☆640 · Sep 21, 2024 · Updated last year
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆996 · Nov 25, 2025 · Updated 3 months ago
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,936 · Aug 15, 2024 · Updated last year
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content ☆603 · Oct 6, 2024 · Updated last year
- Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models" ☆413 · Mar 25, 2024 · Updated last year
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth? ☆145 · Feb 11, 2025 · Updated last year
- 🤗 Unofficial huggingface/diffusers-based implementation of the paper "Training-Free Layout Control with Cross-Attention Guidance". ☆42 · May 24, 2023 · Updated 2 years ago
- ☆15 · Apr 20, 2023 · Updated 2 years ago
- Simple large-scale training of stable diffusion with multi-node support. ☆133 · May 8, 2023 · Updated 2 years ago
- DataComp: In search of the next generation of multimodal datasets ☆772 · Apr 28, 2025 · Updated 10 months ago
- This repo contains the code for 1D tokenizer and generator ☆1,117 · Mar 20, 2025 · Updated 11 months ago
- Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion" ☆1,473 · May 31, 2023 · Updated 2 years ago
- Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers" ☆1,096 · Dec 22, 2025 · Updated 2 months ago
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis ☆3,281 · Oct 31, 2024 · Updated last year
- An unofficial implementation of both ViT-VQGAN and RQ-VAE in PyTorch ☆321 · Apr 7, 2025 · Updated 10 months ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆86 · Jul 16, 2024 · Updated last year
- [NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization. ☆322 · Jul 9, 2024 · Updated last year
- A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models". ☆1,096 · Mar 25, 2023 · Updated 2 years ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆638 · Oct 16, 2025 · Updated 4 months ago
- A suite of image and video neural tokenizers ☆1,711 · Feb 11, 2025 · Updated last year
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,252 · Feb 16, 2025 · Updated last year
- VideoSys: An easy and efficient system for video generation ☆2,016 · Aug 27, 2025 · Updated 6 months ago
- Consistency Distilled Diff VAE ☆2,209 · Nov 7, 2023 · Updated 2 years ago
- An open-source framework for training large multimodal models. ☆4,068 · Aug 31, 2024 · Updated last year
- An open source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi modal … ☆365 · Dec 15, 2023 · Updated 2 years ago
- ☆20 · Nov 21, 2025 · Updated 3 months ago
- Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis ☆646 · May 24, 2024 · Updated last year
- SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training ☆190 · Jan 27, 2025 · Updated last year
- [TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation. ☆1,918 · Oct 30, 2025 · Updated 4 months ago
- [ICLR2025] Halton Scheduler for Masked Generative Image Transformer ☆281 · Oct 28, 2025 · Updated 4 months ago