Alpha-VLLM / Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
☆534 Updated 5 months ago
Alternatives and similar repositories for Lumina-mGPT:
Users who are interested in Lumina-mGPT are comparing it to the repositories listed below:
- ☆354 Updated 2 months ago
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ☆405 Updated 3 months ago
- Multimodal Models in Real World ☆427 Updated 2 months ago
- Code repository for T2V-Turbo and T2V-Turbo-v2 ☆280 Updated 2 months ago
- ☆396 Updated last month
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆395 Updated 4 months ago
- Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis ☆276 Updated last month
- Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation ☆488 Updated 4 months ago
- NOVA: Autoregressive Video Generation without Vector Quantization ☆314 Updated this week
- Memory-optimized training scripts for video models based on Diffusers ☆730 Updated this week
- PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024) ☆476 Updated 7 months ago
- [CVPR2024 Highlight] VBench - We Evaluate Video Generation ☆695 Updated this week
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆393 Updated 2 months ago
- Let's finetune video generation models! ☆357 Updated this week
- ☆264 Updated 5 months ago
- This repo contains the code for 1D tokenizer and generator ☆645 Updated this week
- [ICLR 2024] Code for FreeNoise based on VideoCrafter ☆391 Updated 6 months ago
- Official implementation of OneDiffusion paper ☆583 Updated last month
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆810 Updated 2 weeks ago
- LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusi… ☆445 Updated 4 months ago
- [CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers ☆557 Updated 2 months ago
- Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis ☆871 Updated this week
- (NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis ☆621 Updated 3 months ago
- Stable Video Diffusion Training Code and Extensions. ☆654 Updated 5 months ago
- Official implementation of SEED-LLaMA (ICLR 2024). ☆596 Updated 3 months ago
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope… ☆226 Updated 4 months ago
- GenEval: An object-focused framework for evaluating text-to-image alignment ☆143 Updated 5 months ago
- Official PyTorch implementation of ECCV 2024 Paper: ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback. ☆463 Updated this week
- ☆221 Updated 6 months ago
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content ☆550 Updated 3 months ago