huggingface / open-muse
Open reproduction of MUSE for fast text2image generation.
☆350Updated last year
Alternatives and similar repositories for open-muse
Users that are interested in open-muse are comparing it to the libraries listed below
- Simple large-scale training of stable diffusion with multi-node support.☆133Updated 2 years ago
- Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"☆406Updated last year
- Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis☆315Updated last year
- Huggingface-compatible SDXL Unet implementation that is readily hackable☆423Updated last year
- Get hundreds of millions of image+url pairs from the crawling at home dataset and preprocess them☆220Updated last year
- ☆171Updated last year
- Large-scale text-video dataset. 10 million captioned short videos.☆639Updated 9 months ago
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"☆314Updated last year
- AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more samp…☆285Updated 7 months ago
- ☆506Updated 5 months ago
- Official Jax Implementation of MaskGIT☆513Updated 2 years ago
- Better Aligning Text-to-Image Models with Human Preference. ICCV 2023☆283Updated last year
- MoVQGAN - model for the image encoding and reconstruction☆240Updated last year
- Official implementation of "Controlling Text-to-Image Diffusion by Orthogonal Finetuning".☆292Updated 7 months ago
- Code for instruction-tuning Stable Diffusion.☆232Updated last year
- An unofficial implementation of both ViT-VQGAN and RQ-VAE in Pytorch☆305Updated last month
- PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.☆429Updated last year
- An in-context conditioning version of MUSE with pre-trained checkpoints.☆110Updated 2 years ago
- DataComp: In search of the next generation of multimodal datasets☆710Updated last month
- Easily create large video dataset from video urls☆611Updated 10 months ago
- A linear estimator on top of clip to predict the aesthetic quality of pictures☆552Updated 2 years ago
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model☆409Updated 6 months ago
- An open source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi modal …☆360Updated last year
- Implementation of Lumiere, SOTA text-to-video generation from Google Deepmind, in Pytorch☆275Updated 10 months ago
- Diffusion Reinforcement Learning Library☆185Updated last year
- Official implementation of SEED-LLaMA (ICLR 2024).☆612Updated 8 months ago
- Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV …☆280Updated last year
- Official implementation of Würstchen: Efficient Pretraining of Text-to-Image Models☆546Updated last year
- Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning☆299Updated 10 months ago
- Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M d…☆202Updated 9 months ago