Shenyi-Z / ToCa
Accelerating Diffusion Transformers with Token-wise Feature Caching
★24 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for ToCa
- Collection of awesome generation acceleration resources. ★43 · Updated 2 weeks ago
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching ★75 · Updated 4 months ago
- ★102 · Updated 2 months ago
- PyTorch code for Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers ★34 · Updated 2 months ago
- FORA introduces a simple yet effective caching mechanism in Diffusion Transformer Architecture for faster inference sampling. ★29 · Updated 4 months ago
- This is a repo to track the latest autoregressive visual generation papers. ★50 · Updated this week
- Adaptive Caching for Faster Video Generation with Diffusion Transformers ★98 · Updated 2 weeks ago
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Pekin… ★56 · Updated last month
- [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo… ★55 · Updated 3 months ago
- 🔥 ImageFolder: Autoregressive Image Generation with Folded Tokens ★59 · Updated last week
- Paper survey of efficient computation for large scale models. ★30 · Updated 6 months ago
- Unified Multi-modal IAA Baseline and Benchmark ★70 · Updated last month
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ★106 · Updated 3 weeks ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ★84 · Updated 4 months ago
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ★100 · Updated 6 months ago
- [ICCV 23] An approach to enhance the efficiency of Vision Transformer (ViT) by concurrently employing token pruning and token merging tech… ★89 · Updated last year
- A collection of awesome autoregressive visual generation models ★43 · Updated this week
- ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ★34 · Updated 3 months ago
- Official implementation of Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input ★54 · Updated 2 months ago
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… ★50 · Updated 5 months ago
- Adapting LLaMA Decoder to Vision Transformer ★27 · Updated 6 months ago
- ★41 · Updated 8 months ago
- ★105 · Updated 3 months ago
- [NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification ★11 · Updated 3 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ★125 · Updated last year
- ★23 · Updated 5 months ago
- ★16 · Updated last year
- [ECCV 2024] Isomorphic Pruning for Vision Models ★54 · Updated 4 months ago
- ★109 · Updated 5 months ago
- STAR: Scale-wise Text-to-image generation via Auto-Regressive representations ★123 · Updated 5 months ago