mit-han-lab / hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
☆340 · Updated last month
Related projects
Alternatives and complementary repositories for hart
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image" ☆240 · Updated last month
- ☆193 · Updated 4 months ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆500 · Updated 3 months ago
- Code repository for T2V-Turbo and T2V-Turbo-v2 ☆250 · Updated 3 weeks ago
- I'm back! Implementations of Meissonic developed by the community. If you find it helpful, please consider giving a star ❤️ ☆249 · Updated last week
- SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models ☆327 · Updated this week
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope… ☆212 · Updated 3 months ago
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆389 · Updated last week
- Scaling Diffusion Transformers with Mixture of Experts ☆207 · Updated 2 months ago
- ☆349 · Updated last month
- Adaptive Caching for Faster Video Generation with Diffusion Transformers ☆91 · Updated 2 weeks ago
- VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆137 · Updated 3 weeks ago
- T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free! ☆361 · Updated 2 months ago
- [NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models ☆243 · Updated 2 weeks ago
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆371 · Updated 2 months ago
- This repo contains the code for 1D tokenizer and generator ☆548 · Updated this week
- Open-MAGVIT2: Democratizing Autoregressive Visual Generation ☆705 · Updated last month
- Let's finetune video generation models! ☆236 · Updated this week
- Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models (arXiv 2023 / CVPR 2024) ☆319 · Updated last month
- Memory optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed ☆406 · Updated this week
- Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation ☆460 · Updated 2 months ago
- SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training ☆161 · Updated last month
- STAR: Scale-wise Text-to-image generation via Auto-Regressive representations ☆122 · Updated 5 months ago
- [NeurIPS 2024] Official implementation of "Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models" ☆307 · Updated last month
- GenEval: An object-focused framework for evaluating text-to-image alignment ☆120 · Updated 3 months ago
- A one-stop library to standardize the inference and evaluation of all the conditional image generation models. (ICLR 2024) ☆150 · Updated last month
- AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more samp… ☆242 · Updated 2 weeks ago
- PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024) ☆445 · Updated 5 months ago
- Video-Infinity generates long videos quickly using multiple GPUs without extra training. ☆163 · Updated 3 months ago
- [CVPR 2024] LAMP: Learn a Motion Pattern for Few-Shot Based Video Generation ☆267 · Updated 6 months ago