alexlioralexli / attention-transfer
☆18Updated 5 months ago
Alternatives and similar repositories for attention-transfer:
Users that are interested in attention-transfer are comparing it to the libraries listed below
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆81Updated last year
- [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions☆60Updated 11 months ago
- Distribution-Aware Prompt Tuning for Vision-Language Models (ICCV 2023)☆38Updated last year
- Official Implementation of DiffCLIP: Differential Attention Meets CLIP☆26Updated last month
- Pytorch Implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation☆41Updated this week
- [CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"☆33Updated last week
- The official pytorch implemention of our CVPR-2024 paper "MMA: Multi-Modal Adapter for Vision-Language Models".☆64Updated 2 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…☆39Updated 4 months ago
- Official implementation for paper "Knowledge Diffusion for Distillation", NeurIPS 2023☆82Updated last year
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning☆49Updated 11 months ago
- Implementation of "DIME-FM: DIstilling Multimodal and Efficient Foundation Models"☆14Updated last year
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models☆57Updated 4 months ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …☆52Updated 5 months ago
- ☆34Updated 9 months ago
- [NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model☆86Updated last year
- This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision☆74Updated 10 months ago
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning☆21Updated 7 months ago
- Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023)☆56Updated last year
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆34Updated last year
- ☆34Updated last year
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆38Updated 4 months ago
- Adapting LLaMA Decoder to Vision Transformer☆28Updated 11 months ago
- ☆86Updated 2 years ago
- ☆23Updated 2 years ago
- [ICLR2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models☆160Updated last year
- Official Implementation of "Read-only Prompt Optimization for Vision-Language Few-shot Learning", ICCV 2023☆53Updated last year
- [CVPR 2024] Improving language-visual pretraining efficiency by perform cluster-based masking on images.☆27Updated 11 months ago
- [CVPR'23 & TPAMI'25] Hard Patches Mining for Masked Image Modeling☆93Updated last week
- [CVPR 2024] Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification☆29Updated last year
- AlignCLIP: Improving Cross-Modal Alignment in CLIP (ICLR 2025)☆35Updated last month