benbergner / cropr
A token pruning method that accelerates ViTs for various tasks while maintaining high performance.
☆12Updated 4 months ago
Alternatives and similar repositories for cropr:
Users that are interested in cropr are comparing it to the libraries listed below
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection".☆37Updated 7 months ago
- Distribution-Aware Prompt Tuning for Vision-Language Models (ICCV 2023)☆38Updated last year
- 🔥 🔥 [WACV2024] Mini but Mighty: Finetuning ViTs with Mini Adapters☆20Updated 10 months ago
- An Enhanced CLIP Framework for Learning with Synthetic Captions☆28Updated 3 weeks ago
- Pytorch Implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation☆42Updated 2 weeks ago
- This is the official code for paper: Token Summarisation for Efficient Vision Transformers via Graph-based Token Propagation☆27Updated last year
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.☆18Updated 4 months ago
- ☆41Updated 6 months ago
- [CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong C…☆25Updated 3 years ago
- Lightweight Transformer for Multi-modal Tasks☆16Updated 2 years ago
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation☆37Updated last year
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models☆18Updated 3 months ago
- ☆23Updated 2 years ago
- [CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"☆34Updated 3 weeks ago
- [NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization☆30Updated 7 months ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …☆52Updated 6 months ago
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types☆17Updated 3 weeks ago
- Code for CVPR2025 "MMRL: Multi-Modal Representation Learning for Vision-Language Models".☆31Updated 3 weeks ago
- ☆21Updated 9 months ago
- ☆22Updated last year
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"☆17Updated 6 months ago
- Official pytorch implementation of NeurIPS 2022 paper, TokenMixup☆48Updated 2 years ago
- Official Pytorch Implementation of Self-emerging Token Labeling☆33Updated last year
- Code for Learned Thresholds Token Merging and Pruning for Vision Transformers (LTMP). A technique to reduce the size of Vision Transforme…☆16Updated 5 months ago
- ☆12Updated last year
- ☆13Updated 5 months ago
- [ICLR 23] Contrastive Aligned of Vision to Language Through Parameter-Efficient Transfer Learning☆39Updated last year
- Collect papers about Mamba (a selective state space model).☆14Updated 9 months ago
- [ICLR 2023] “ Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Better Representations”, Ziyu Jian…☆24Updated 2 years ago
- ☆31Updated last year