KishoreP1 / DetailCLIP
Detail-Oriented CLIP for Fine-Grained Tasks
☆35 · Updated last month
Related projects
Alternatives and complementary repositories for DetailCLIP
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration" ☆61 · Updated last month
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models ☆93 · Updated 2 months ago
- FreeDA: Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation (CVPR 2024) ☆29 · Updated 2 months ago
- A Large Multimodal Model for Pixel-Level Visual Grounding in Videos ☆34 · Updated 2 weeks ago
- This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati… ☆64 · Updated 5 months ago
- ☆52 · Updated 3 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆78 · Updated 8 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos ☆33 · Updated 2 weeks ago
- [CVPR 2024] Official implementation of "Universal Segmentation at Arbitrary Granularity with Language Instruction" ☆78 · Updated 8 months ago
- [ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation ☆96 · Updated 9 months ago
- Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference ☆131 · Updated last month
- CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation ☆65 · Updated 3 months ago
- ☆33 · Updated last month
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ☆133 · Updated 3 months ago
- [ICML2024] The official implementation of SemiRES in PyTorch. ☆19 · Updated 5 months ago
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models" ☆96 · Updated last week
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆49 · Updated 3 months ago
- ☆16 · Updated last year
- [ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation ☆56 · Updated 2 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆64 · Updated last month
- ☆12 · Updated 11 months ago
- Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs". ☆42 · Updated 2 months ago
- PyTorch implementation for the CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation ☆21 · Updated last month
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding ☆30 · Updated this week
- [CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners ☆40 · Updated last year
- [ECCV 2024 Best Paper Candidate] Implementation of "Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Vi… ☆40 · Updated last month
- [IEEE TCSVT] Official PyTorch Implementation of CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation. ☆35 · Updated 3 weeks ago
- Composed Video Retrieval ☆46 · Updated 6 months ago
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation ☆28 · Updated 8 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆106 · Updated 3 weeks ago