xuyang-liu16 / Awesome-Token-Reduction-for-Model-Compression
Collection of token reduction for model compression resources.
☆18 · Updated this week
Alternatives and similar repositories for Awesome-Token-Reduction-for-Model-Compression:
Users who are interested in Awesome-Token-Reduction-for-Model-Compression are comparing it to the repositories listed below.
- Collection of awesome generation acceleration resources. ☆84 · Updated this week
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Pekin… ☆66 · Updated 2 months ago
- Code release for VTW (AAAI 2025) ☆27 · Updated last month
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ☆109 · Updated 7 months ago
- The paper collections for the autoregressive models in vision. ☆350 · Updated 2 weeks ago
- This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality" ☆43 · Updated last week
- Accelerating Diffusion Transformers with Token-wise Feature Caching ☆45 · Updated this week
- This is a repo to track the latest autoregressive visual generation papers. ☆96 · Updated last week
- ☆31 · Updated last month
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Infe… ☆86 · Updated 2 months ago
- ☆91 · Updated 6 months ago
- ☆116 · Updated 6 months ago
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster. ☆43 · Updated 3 weeks ago
- ☆25 · Updated 6 months ago
- ☆33 · Updated last week
- [NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models" ☆160 · Updated 3 months ago
- A paper list about Token Merge, Reduce, Resample, Drop for MLLMs. ☆15 · Updated 3 weeks ago
- Accelerating Vision Diffusion Transformers with Skip Branches. ☆58 · Updated 3 weeks ago
- [NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model ☆85 · Updated last year
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching ☆89 · Updated 5 months ago
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models" ☆131 · Updated this week
- [NeurIPS 2024] Visual Perception by Large Language Model's Weights ☆33 · Updated 2 months ago
- 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆219 · Updated last week
- ☆46 · Updated last week
- VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆194 · Updated 2 months ago
- A paper list of some recent works about Token Compress for ViT and VLM ☆265 · Updated this week
- The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction". ☆50 · Updated this week
- A collection of vision foundation models unifying understanding and generation. ☆32 · Updated last week
- ☆43 · Updated 3 weeks ago
- [NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment ☆56 · Updated 3 months ago