FudanDISC / Awesome-Multimodal-Large-Language-Models
Papers of "A Survey on Multimodal LLMs from the Perspective of Input-Output Space Extension"
☆16 · Updated last week
Alternatives and similar repositories for Awesome-Multimodal-Large-Language-Models
Users who are interested in Awesome-Multimodal-Large-Language-Models are comparing it to the repositories listed below
- [CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering ☆53 · Updated 6 months ago
- CLIMB-ReID: A Hybrid CLIP-Mamba Framework for Person Re-Identification (AAAI2025) ☆43 · Updated 2 months ago
- Vision-Language based Visual Object Tracking ☆27 · Updated 4 months ago
- [ICCV2025] ModPrompt: Visual Modality Prompt for Adapting Vision-Language Object Detectors ☆22 · Updated 7 months ago
- Official code for Fine-Grained Visual Prompting, NeurIPS 2023 ☆56 · Updated 2 years ago
- ☆14 · Updated last year
- [CVPR 2024] Official implementation of the paper "DePT: Decoupled Prompt Tuning" ☆109 · Updated 2 months ago
- [ICCV 2023 oral] This is the official repository for our paper: "Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning". ☆75 · Updated 2 years ago
- [COLING'25] HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding ☆44 · Updated last year
- This repository contains the implementation of the method described in our paper, "Divide and Conquer: Isolating Normal-Abnormal Attribut… ☆10 · Updated last year
- [PRCV-2023, IEEE TMM-2025] Learning Bottleneck Transformer for Event Image-Voxel Feature Fusion based Classification ☆12 · Updated last month
- CVPR2024: Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models ☆90 · Updated last year
- Official PyTorch implementation of ZiRa, a method for incremental vision language object detection (IVLOD), which has been accepted by Neu… ☆29 · Updated last year
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆108 · Updated 8 months ago
- IPO: Interpretable Prompt Optimization for Vision-Language Models (NeurIPS 2024) ☆15 · Updated 11 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆69 · Updated last year
- ECCV24 "ReMamber: Referring Image Segmentation with Mamba Twister" official repository. ☆44 · Updated last year
- ☆16 · Updated 10 months ago
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at NeurIPS 2023 (Main confer… ☆27 · Updated last year
- [NeurIPS 2023] Meta-Adapter ☆48 · Updated 2 years ago
- Official implementation of "What does CLIP know about a red circle? Visual Prompt Engineering for VLMs", ICCV 2023 ☆11 · Updated 2 years ago
- ☆34 · Updated 2 years ago
- Official PyTorch implementation of "E2VPT: An Effective and Efficient Approach for Visual Prompt Tuning". (ICCV2023) ☆72 · Updated 2 years ago
- Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples ☆40 · Updated last year
- ☆14 · Updated 2 years ago
- [NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models ☆49 · Updated last year
- [CVPR2025] Rethinking Query-based Transformer for Continual Image Segmentation ☆41 · Updated 6 months ago
- Open-vocabulary Semantic Segmentation ☆33 · Updated last year
- [CVPR 2024] Official Repository for "Efficient Test-Time Adaptation of Vision-Language Models" ☆114 · Updated last year
- Official Implementation of Towards Open Vocabulary Video Semantic Segmentation ☆14 · Updated 11 months ago