ShunqiM / PM
☆12Updated 5 months ago
Alternatives and similar repositories for PM
Users who are interested in PM are comparing it to the repositories listed below
- The official implementation of "Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models"☆17Updated 9 months ago
- [ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models☆86Updated 11 months ago
- ☆104Updated 5 months ago
- CVPR2024: Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models☆89Updated last year
- [CVPR 2025] RAP: Retrieval-Augmented Personalization☆78Updated last month
- Code for ICLR 2025 Paper: Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs☆21Updated 8 months ago
- [ECCV 2024] Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models☆56Updated last year
- [CVPR 2025] Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Att…☆62Updated 3 months ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention☆60Updated last year
- code for FineLIP☆38Updated last month
- [CVPR 2025] VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification☆49Updated 9 months ago
- [ICCV 2025] Official implementation of LLaVA-KD: A Framework of Distilling Multimodal Large Language Models☆122Updated 3 months ago
- [ICML'25] Kernel-based Unsupervised Embedding Alignment for Enhanced Visual Representation in Vision-language Models☆21Updated 4 months ago
- ☆22Updated 8 months ago
- cliptrase☆47Updated last year
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆204Updated 6 months ago
- This is the official repository for paper: cross-modal information flow in multimodal large language models☆38Updated 8 months ago
- Project Page for "Multi-Task Dense Prediction via Mixture of Low-Rank Experts"☆87Updated 7 months ago
- CAD - Memory Efficient Convolutional Adapter for Segment Anything☆12Updated last year
- [ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"☆73Updated 3 months ago
- 🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".☆47Updated 10 months ago
- This repo is the official pytorch implementation of the paper: CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-V…☆40Updated 4 months ago
- The official code for "TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning" | [AAAI2025]☆48Updated 10 months ago
- [ECCV 2024] Soft Prompt Generation for Domain Generalization☆29Updated last year
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆39Updated last year
- [CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"☆56Updated 5 months ago
- Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)☆51Updated 3 months ago
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding☆165Updated last month
- [ECCV 2024] Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation☆33Updated 10 months ago
- ☆82Updated 9 months ago