HankYe / Once-for-Both

[CVPR'24] Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression

☆15

Alternatives and similar repositories for Once-for-Both

Users who are interested in Once-for-Both compare it to the repositories listed below.

sdc17 / UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
☆105 Updated 11 months ago
ChangyuanWang17 / QVLM
[NeurIPS'24] Efficient and accurate memory-saving method towards W4A4 large multi-modal models.
☆91 Updated 10 months ago
OpenGVLab / DiffRate
[ICCV 23] An approach to enhance the efficiency of Vision Transformer (ViT) by concurrently employing token pruning and token merging tech…
☆101 Updated 2 years ago
thu-nics / FrameFusion
[ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models"
☆68 Updated this week
Gumpest / SparseVLMs
[ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference".
☆193 Updated 5 months ago
thu-nics / MBQ
The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"
☆66 Updated 8 months ago
NUS-HPC-AI-Lab / Dynamic-Tuning
The official implementation of "2024NeurIPS Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation"
☆51 Updated 11 months ago
VainF / Isomorphic-Pruning
[ECCV 2024] Isomorphic Pruning for Vision Models
☆78 Updated last year
Adlith / MoE-Jetpack
[NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
☆133 Updated last year
Aaronhuang-778 / Mixture-Compressor-MoE
[ICLR 2025] Mixture Compressor for Mixture-of-Experts LLMs Gains More
☆62 Updated 9 months ago
Cooperx521 / PyramidDrop
(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
☆134 Updated 8 months ago
chengtao-lv / PTQ4SAM
[CVPR 2024] PTQ4SAM: Post-Training Quantization for Segment Anything
☆82 Updated last year
Hsu1023 / DuQuant
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
☆176 Updated last year
YanjingLi0202 / Q-ViT
The official implementation of the NeurIPS 2022 paper Q-ViT.
☆101 Updated 2 years ago
ModelTC / TFMQ-DM
[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for…
☆109 Updated 2 months ago
ThisisBillhe / torch_quantizer
torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for diffusion models.
☆22 Updated last year
ziplab / SN-Net
[CVPR 2023 Highlight] This is the official implementation of "Stitchable Neural Networks".
☆249 Updated 2 years ago
NUS-HPC-AI-Lab / R-MeeTo
Give us minutes, we give back a faster Mamba. The official implementation of "Faster Vision Mamba is Rebuilt in Minutes via Merged Token …
☆40 Updated 11 months ago
42Shawn / LLaVA-PruMerge
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
☆154 Updated 2 months ago
JinXins / Awesome-Token-Merge-for-MLLMs
A paper list about Token Merge, Reduce, Resample, Drop for MLLMs.
☆75 Updated last month
KD-TAO / DyCoke
[CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models
☆90 Updated last week
hustvl / PD-Quant
[CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric
☆60 Updated 2 years ago
sdc17 / CrossGET
[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
☆34 Updated 11 months ago
Theia-4869 / FasterVLM
Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster.
☆97 Updated 5 months ago
DZY122 / DiTAS
DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing (WACV 2025)
☆12 Updated last year
double125 / MADTP
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
☆49 Updated last year
Efficient-ML / Awesome-Efficient-AIGC
A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including languag…
☆201 Updated 9 months ago
pkunlp-icler / FastV
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Langua…
☆519 Updated 10 months ago
thu-nics / ViDiT-Q
[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
☆138 Updated 8 months ago
JingyangXiang / DFRot
[COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation; Zhihu: https://zhuanlan.zhihu.c…
☆28 Updated 8 months ago