mit-han-lab / hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
☆405 Updated 3 months ago
Alternatives and similar repositories for hart:
Users who are interested in hart are comparing it to the repositories listed below.
- ☆221 Updated 6 months ago
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image" ☆276 Updated 3 weeks ago
- NOVA: Autoregressive Video Generation without Vector Quantization ☆314 Updated this week
- ☆253 Updated 2 weeks ago
- Adaptive Caching for Faster Video Generation with Diffusion Transformers ☆134 Updated 2 months ago
- ☆182 Updated last month
- [arXiv'25] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models ☆205 Updated this week
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆534 Updated 5 months ago
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope… ☆226 Updated 4 months ago
- Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up". ☆184 Updated 2 weeks ago
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆393 Updated 2 months ago
- Code repository for T2V-Turbo and T2V-Turbo-v2 ☆280 Updated 2 months ago
- [NeurIPS 2024] Official implementation of "Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models" ☆325 Updated 3 months ago
- This repo contains the code for 1D tokenizer and generator ☆645 Updated this week
- 📚 Collection of awesome generation acceleration resources. ☆93 Updated this week
- VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆199 Updated this week
- Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis ☆276 Updated last month
- ☆354 Updated 2 months ago
- Scaling Diffusion Transformers with Mixture of Experts ☆242 Updated 4 months ago
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆395 Updated 4 months ago
- Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation ☆488 Updated 4 months ago
- Official code for "ControlAR: Controllable Image Generation with Autoregressive Models" ☆175 Updated 3 weeks ago
- Awesome diffusion Video-to-Video (V2V). A collection of paper on diffusion model-based video editing, aka. video-to-video (V2V) translati… ☆180 Updated this week
- SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training ☆162 Updated 2 months ago
- PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024) ☆476 Updated 7 months ago
- This is a PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Framework for Cross-Modality Evolu… ☆124 Updated 2 weeks ago
- T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free! ☆375 Updated 4 months ago
- GenEval: An object-focused framework for evaluating text-to-image alignment ☆143 Updated 5 months ago
- Let's finetune video generation models! ☆357 Updated this week
- [NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models ☆257 Updated last month