mit-han-lab / hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
☆561 Updated 6 months ago
Alternatives and similar repositories for hart:
Users who are interested in hart are comparing it to the repositories listed below.
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆468 Updated 2 weeks ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆584 Updated 2 weeks ago
- Official implementation of OneDiffusion paper (CVPR 2025) ☆623 Updated 4 months ago
- (NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis ☆745 Updated last month
- VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE ☆314 Updated 3 months ago
- Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up". ☆204 Updated last week
- This is a repo to track the latest autoregressive visual generation papers. ☆257 Updated this week
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models ☆536 Updated this week
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image" ☆300 Updated 3 months ago
- [ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think ☆930 Updated last month
- Scaling Diffusion Transformers with Mixture of Experts ☆311 Updated 7 months ago
- T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free! ☆394 Updated last month
- (CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models ☆186 Updated this week
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆865 Updated 2 months ago
- Official Implementation of Video-T1: Test-Time Scaling for Video Generation ☆246 Updated 2 weeks ago
- GenEval: An object-focused framework for evaluating text-to-image alignment ☆232 Updated last month
- [CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis ☆1,180 Updated last month
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope… ☆266 Updated last month
- [ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image… ☆305 Updated last month
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆411 Updated 5 months ago
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆431 Updated 7 months ago
- PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024) ☆497 Updated 10 months ago
- [CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models ☆695 Updated last week
- Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in PyTorch ☆338 Updated 3 months ago
- Official PyTorch Implementation of "History-Guided Video Diffusion" ☆264 Updated last month
- This repo contains the code for 1D tokenizer and generator ☆832 Updated last month
- Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation ☆528 Updated 7 months ago
- ☆170 Updated this week
- Memory-optimized training library for diffusion models ☆1,048 Updated last week
- Official implementation of Inductive Moment Matching ☆448 Updated last month