mit-han-lab / hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
☆324 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for hart
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image" ☆237 Updated last month
- Scaling Diffusion Transformers with Mixture of Experts ☆202 Updated 2 months ago
- ☆189 Updated 3 months ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆495 Updated 2 months ago
- T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free! ☆359 Updated 2 months ago
- Code repository for T2V-Turbo and T2V-Turbo-v2 ☆249 Updated 3 weeks ago
- PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024) ☆442 Updated 5 months ago
- I'm back! Related Sources of Meissonic developed by Community ☆202 Updated this week
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope… ☆211 Updated 2 months ago
- SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training ☆160 Updated 3 weeks ago
- Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models (arXiv 2023 / CVPR 2024) ☆317 Updated last month
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆385 Updated this week
- [NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models ☆240 Updated last week
- SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models ☆142 Updated this week
- ☆343 Updated 3 weeks ago
- Let's finetune video generation models! ☆186 Updated this week
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆366 Updated 2 months ago
- GenEval: An object-focused framework for evaluating text-to-image alignment ☆116 Updated 3 months ago
- AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more samp… ☆241 Updated last week
- Open-MAGVIT2: Democratizing Autoregressive Visual Generation ☆690 Updated last month
- Adaptive Caching for Faster Video Generation with Diffusion Transformers ☆79 Updated last week
- A one-stop library to standardize the inference and evaluation of all the conditional image generation models. (ICLR 2024) ☆149 Updated last month
- Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation ☆454 Updated last month
- This repo contains the code for 1D tokenizer and generator ☆534 Updated this week
- Official implementation of "Controlling Text-to-Image Diffusion by Orthogonal Finetuning". ☆279 Updated 3 weeks ago
- [NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation ☆208 Updated this week
- 🔥 [CVPR 2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models (SLD)" ☆154 Updated 7 months ago
- Code for "Diffusion Model Alignment Using Direct Preference Optimization" ☆261 Updated 10 months ago
- Video-Infinity generates long videos quickly using multiple GPUs without extra training. ☆163 Updated 3 months ago
- Official code for "ControlAR: Controllable Image Generation with Autoregressive Models" ☆114 Updated last week