☆88Jan 4, 2024Updated 2 years ago
Alternatives and similar repositories for amused
Users that are interested in amused are comparing it to the libraries listed below.
- Open reproduction of MUSE for fast text2image generation.☆359Jun 1, 2024Updated last year
- An in-context conditioning version of MUSE with pre-trained checkpoints.☆116Jun 4, 2023Updated 2 years ago
- A curated list of papers and resources for text-to-image evaluation.☆30Sep 6, 2023Updated 2 years ago
- Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer☆16Nov 21, 2024Updated last year
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…☆30Oct 21, 2024Updated last year
- 🤗 Unofficial huggingface/diffusers-based implementation of the paper "Training-Free Layout Control with Cross-Attention Guidance".☆42May 24, 2023Updated 2 years ago
- Consistency Distilled Diff VAE☆2,213Nov 7, 2023Updated 2 years ago
- "FreeU: Free Lunch in Diffusion U-Net" for Huggingface Diffusers☆102Oct 6, 2023Updated 2 years ago
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth?☆147Feb 11, 2025Updated last year
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆30Dec 22, 2025Updated 3 months ago
- GroupViT: Semantic Segmentation Emerges from Text Supervision☆25Dec 15, 2022Updated 3 years ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini…☆643Oct 16, 2025Updated 5 months ago
- Official Implementation for "Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models" (SIGGRAPH 2023)☆767Jan 26, 2024Updated 2 years ago
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content☆605Oct 6, 2024Updated last year
- Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation☆23Sep 24, 2025Updated 5 months ago
- [ICLR2025] Halton Scheduler for Masked Generative Image Transformer☆282Oct 28, 2025Updated 4 months ago
- i-mae Pytorch Repo☆20Apr 6, 2024Updated last year
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale☆214Feb 27, 2024Updated 2 years ago
- [ICLR 2024] Code for FreeNoise based on VideoCrafter☆428Aug 25, 2025Updated 6 months ago
- ☆98Jul 24, 2025Updated 7 months ago
- [NeurIPS2022] Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop☆14Apr 13, 2023Updated 2 years ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis☆86Jul 16, 2024Updated last year
- ☆19Apr 1, 2025Updated 11 months ago
- Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis☆651May 24, 2024Updated last year
- [ICML 2025 Spotlight] Direct Discriminative Optimization: Reinforcing Diffusion/Autoregressive with GAN Discrimination☆118Jan 27, 2026Updated last month
- [ICLR2025] IV-Mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis☆39Feb 17, 2025Updated last year
- [ICML 2025] Diff-MoE: Diffusion Transformer with Time-Aware and Space-Adaptive Experts☆31Nov 10, 2025Updated 4 months ago
- SEED-Voken: A Series of Powerful Visual Tokenizers☆998Nov 25, 2025Updated 3 months ago
- SDXL GPU cluster scripts☆16Oct 28, 2023Updated 2 years ago
- AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more samp…☆314Nov 1, 2024Updated last year
- ACM MM'23 (oral), SUR-adapter for pre-trained diffusion models can acquire the powerful semantic understanding and reasoning capabilities…☆120Sep 4, 2025Updated 6 months ago
- [TMLR] Official PyTorch implementation of "λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent…☆53Nov 29, 2024Updated last year
- Reproduction of the first step in the text-to-video model Phenaki. Code and model weights for the Transformer-based autoencoder for video…☆29Aug 4, 2023Updated 2 years ago
- 3D generation on ImageNet [ICLR 2023]☆214May 23, 2023Updated 2 years ago
- [ICCV 2023] Online Clustered Codebook☆184Sep 19, 2024Updated last year
- ControlLoRA Version 2: A Lightweight Neural Network To Control Stable Diffusion Spatial Information Version 2☆112Jul 31, 2024Updated last year
- [ICLR 2025] Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding☆51Apr 21, 2025Updated 11 months ago
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis☆3,281Oct 31, 2024Updated last year
- This repo contains the code for 1D tokenizer and generator☆1,129Mar 20, 2025Updated last year