NUS-HPC-AI-Lab / Dynamic-Diffusion-Transformer
★34, updated last month
Related projects
Alternatives and complementary repositories for Dynamic-Diffusion-Transformer
- This is a repo to track the latest autoregressive visual generation papers. (★50, updated this week)
- 🔥 ImageFolder: Autoregressive Image Generation with Folded Tokens (★57, updated last week)
- Collection of awesome generation acceleration resources. (★43, updated 2 weeks ago)
- [NeurIPS 2024] The official implement of research paper "FreeLong : Training-Free Long Video Generation with SpectralBlend Temporal Atten… (★29, updated last week)
- The official implementation of "2024NeurIPS Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation" (★36, updated last month)
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" (★76, updated last month)
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" (★34, updated 5 months ago)
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective (★41, updated 3 weeks ago)
- [ICLR 2024] Official pytorch implementation of "Denoising Task Routing for Diffusion Models" (★19, updated 9 months ago)
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis (★84, updated 4 months ago)
- Learning 1D Causal Visual Representation with De-focus Attention Networks (★30, updated 5 months ago)
- (★26, updated 4 months ago)
- Empowering Unified MLLM with Multi-granular Visual Generation (★106, updated last month)
- MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models (★55, updated 2 months ago)
- Adapting LLaMA Decoder to Vision Transformer (★27, updated 6 months ago)
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching (★75, updated 4 months ago)
- T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation (★49, updated 2 months ago)
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" (★26, updated 5 months ago)
- a collection of awesome autoregressive visual generation models (★43, updated this week)
- [ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation (★31, updated 2 months ago)
- The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction". (★45, updated 3 weeks ago)
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt… (★27, updated last month)
- Denoising Diffusion Step-aware Models (ICLR2024) (★52, updated 9 months ago)
- The official implementation of "Adapter is All You Need for Tuning Visual Tasks". (★72, updated 2 months ago)
- CAR: Controllable AutoRegressive Modeling for Visual Generation (★49, updated this week)
- STAR: Scale-wise Text-to-image generation via Auto-Regressive representations (★123, updated 5 months ago)
- Video Diffusion State Space Models (★19, updated 7 months ago)
- [ECCV 2024] Official pytorch implementation of "Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts" (★32, updated 4 months ago)
- (★23, updated 3 months ago)