NKU-MetautoAI / awesome-large-vision-language-models
Advances in recent large vision language models (LVLMs)
☆14 Updated 9 months ago
Alternatives and similar repositories for awesome-large-vision-language-models
Users who are interested in awesome-large-vision-language-models are comparing it to the repositories listed below.
- [ECCV 2024 Workshop Best Paper Award] Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion ☆34 Updated 8 months ago
- [CVPR2025] Official implementation of RAM ☆17 Updated 3 months ago
- [CVPR 2024] Depth-aware Test-Time Training for Zero-shot Video Object Segmentation ☆26 Updated last month
- [IJCV 2024] ☆16 Updated 7 months ago
- ☆29 Updated last year
- [ECCV-24] This is the official implementation of the paper "SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation". ☆23 Updated 8 months ago
- [NeurIPS 2024] Official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone". ☆40 Updated 5 months ago
- Code for the ICML 2023 paper Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation ☆37 Updated last year
- Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dep… ☆13 Updated last month
- Sambor: Boosting Segment Anything Model Towards Open-Vocabulary Learning ☆30 Updated last year
- [ECCV'24 Oral] Anytime Continual Learning for Open Vocabulary Classification ☆20 Updated 8 months ago
- (ICCV 2023) MasQCLIP for Open-Vocabulary Universal Image Segmentation ☆38 Updated last year
- [NIPS24] Official Implementation of Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation ☆18 Updated 7 months ago
- ☆35 Updated last year
- ☆36 Updated 2 years ago
- DiverGen (CVPR 2024) & BSGAL (ICML 2024) ☆46 Updated 2 months ago
- [NeurIPS 2024] OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling. ☆20 Updated 3 months ago
- ☆15 Updated 7 months ago
- ICLR 2023 and ICML 2023 paper ☆20 Updated 9 months ago
- Official Implementation of "Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning" in AAAI2024. ☆13 Updated last year
- [CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval ☆23 Updated 3 months ago
- ☆57 Updated last month
- The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (arXiv 2025) ☆30 Updated 3 weeks ago
- [CVPR 2025] EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance ☆19 Updated 2 months ago
- Segment Anything with Deictic Prompting ☆26 Updated last month
- CAD - Memory Efficient Convolutional Adapter for Segment Anything ☆12 Updated 8 months ago
- Vision and Language Reference Prompt into SAM for Few-shot Segmentation ☆17 Updated 2 months ago
- [ECCV24] The official code repository for paper "Training-Free Model Merging for Multi-target Domain Adaptation". ☆15 Updated 8 months ago
- [CVPR 2025] COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training ☆21 Updated 2 months ago
- Implementation of "VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation" ☆13 Updated 2 years ago