Ziwei-Zheng / LVLM-Stethoscope
A library of visualization tools for the interpretability and hallucination analysis of large vision-language models (LVLMs).
☆20 Updated last month
Related projects
Alternatives and complementary repositories for LVLM-Stethoscope
- [ICCV 23] An approach to enhance the efficiency of Vision Transformer (ViT) by concurrently employing token pruning and token merging tech… ☆89 Updated last year
- [NeurIPS'22] This is an official implementation for "Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning". ☆173 Updated last year
- [ICCV 2023 oral] This is the official repository for our paper: "Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning". ☆64 Updated last year
- Official implementation of Dynamic Perceiver ☆41 Updated last year
- [ICCV 2023 & AAAI 2023] Binary Adapters & FacT, [Tech report] Convpass ☆171 Updated last year
- [CVPR-22] This is the official implementation of the paper "Adavit: Adaptive vision transformers for efficient image recognition". ☆49 Updated 2 years ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆64 Updated last month
- ImageNet-1K data download and processing for use as a dataset ☆65 Updated last year
- Official implementation for paper "Knowledge Diffusion for Distillation", NeurIPS 2023 ☆76 Updated 9 months ago
- [ICCV2023] - CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation ☆29 Updated last month
- Project Page for "Multi-Task Dense Prediction via Mixture of Low-Rank Experts" ☆56 Updated last month
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Pekin… ☆55 Updated last month
- [CVPR 2023] Diversity-Aware Meta Visual Prompting ☆78 Updated 11 months ago
- [IEEE TIP] Fine-grained Recognition with Learnable Semantic Data Augmentation ☆27 Updated 11 months ago
- Test-time Prompt Tuning (TPT) for zero-shot generalization in vision-language models (NeurIPS 2022) ☆145 Updated 2 years ago
- [CVPR 2024] Official implementation of the paper "DePT: Decoupled Prompt Tuning" ☆75 Updated this week
- [arXiv] Cross-Modal Adapter for Text-Video Retrieval ☆55 Updated 2 years ago
- [NeurIPS 2023] Rank-DETR for High Quality Object Detection ☆87 Updated last year
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at NeurIPS 2023 (Main confer… ☆22 Updated 7 months ago
- A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability ☆35 Updated 2 weeks ago
- Official PyTorch implementation of "E2VPT: An Effective and Efficient Approach for Visual Prompt Tuning". (ICCV2023) ☆67 Updated 10 months ago
- Task Residual for Tuning Vision-Language Models (CVPR 2023) ☆66 Updated last year
- The official implementation of "Adapter is All You Need for Tuning Visual Tasks". ☆72 Updated 2 months ago
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models ☆15 Updated 4 months ago
- Official implementation of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer ☆69 Updated 2 years ago
- Implementation of HAT https://arxiv.org/pdf/2204.00993 ☆47 Updated 8 months ago