ArmenJeddi / saint
A training-free approach to accelerate ViTs and VLMs by pruning redundant tokens based on similarity.
☆17 · Updated 3 weeks ago
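The repository's exact pruning criterion isn't shown on this page, but the core idea of similarity-based token pruning can be sketched in a few lines. The following is an illustrative NumPy sketch, not SAINT's actual implementation; the function name `prune_redundant_tokens` and the nearest-neighbor redundancy score are assumptions for the example:

```python
import numpy as np

def prune_redundant_tokens(tokens: np.ndarray, keep: int) -> np.ndarray:
    """Drop the most redundant tokens by cosine similarity (illustrative sketch).

    tokens: (N, D) array of token embeddings.
    keep:   number of tokens to retain.
    Returns a (keep, D) array of the least-redundant tokens, original order preserved.
    """
    # Normalize rows so dot products are cosine similarities.
    normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # ignore each token's similarity to itself

    # Assumed redundancy score: similarity to the nearest other token.
    # A token that closely duplicates another token scores high.
    redundancy = sim.max(axis=1)

    # Keep the `keep` least-redundant tokens, preserving sequence order.
    kept = np.sort(np.argsort(redundancy, kind="stable")[:keep])
    return tokens[kept]

# Example: two near-duplicate tokens; pruning to 3 drops one of the pair.
tokens = np.array([[1.0, 0.0], [0.999, 0.01], [0.0, 1.0], [-1.0, 0.2]])
out = prune_redundant_tokens(tokens, keep=3)
```

Because no similarity threshold or score is learned, a step like this can be inserted between transformer blocks without any retraining, which is what "training-free" refers to in the description above.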
Alternatives and similar repositories for saint:
Users interested in saint are comparing it to the repositories listed below.
- ☆36 · Updated 9 months ago
- [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference ☆30 · Updated last year
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24) ☆46 · Updated last month
- Official PyTorch implementation of Which Tokens to Use? Investigating Token Reduction in Vision Transformers presented at ICCV 2023 NIVT … ☆35 · Updated last year
- Adapting LLaMA Decoder to Vision Transformer ☆28 · Updated 11 months ago
- [CVPR 2025] Breaking the Low-Rank Dilemma of Linear Attention ☆16 · Updated last month
- [NeurIPS 2024] Official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone" ☆38 · Updated 3 months ago
- GIFT: Generative Interpretable Fine-Tuning ☆20 · Updated 6 months ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Vision-Language Models ☆30 · Updated last week
- [NeurIPS 2024] The official implementation of "Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation" ☆44 · Updated 3 months ago
- Official implementation for FlexAttention for Efficient High-Resolution Vision-Language Models ☆39 · Updated 3 months ago
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning ☆49 · Updated 11 months ago
- [CVPR 2024] The official PyTorch implementation of "A General and Efficient Training for Transformer via Token Expansion" ☆44 · Updated last year
- CLIP-MoE: Mixture of Experts for CLIP ☆31 · Updated 6 months ago
- Official implementation of NeurIPS 2024 "Visual Fourier Prompt Tuning" ☆26 · Updated 3 months ago
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models ☆18 · Updated 2 months ago
- ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2 ☆63 · Updated 5 months ago
- [ECCV 2024] Isomorphic Pruning for Vision Models ☆66 · Updated 9 months ago
- Collected papers about Mamba (a selective state space model) ☆14 · Updated 8 months ago
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆29 · Updated 6 months ago
- [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference ☆39 · Updated 10 months ago
- [CVPR 2025] Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training ☆35 · Updated last month
- ☆26 · Updated 10 months ago
- [ICCV 2023] An approach to enhance the efficiency of Vision Transformers (ViT) by concurrently employing token pruning and token merging techniques ☆94 · Updated last year
- ☆12 · Updated last month
- [NeurIPS'24] Official PyTorch implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment ☆57 · Updated 7 months ago
- This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision ☆74 · Updated 10 months ago
- [ECCV 2024 Workshop Best Paper Award] Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion ☆31 · Updated 6 months ago
- [CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models ☆13 · Updated 3 weeks ago
- [ICCV 2023 oral] This is the official repository for our paper "Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning" ☆69 · Updated last year