mcahny / rovit
RO-ViT CVPR 2023 "Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers"
☆17 Updated last year
Related projects
Alternatives and complementary repositories for rovit
- Code release for "Language-conditioned Detection Transformer" ☆85 Updated 5 months ago
- Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity" ☆54 Updated last year
- [CVPR24] Official Implementation of GEM (Grounding Everything Module) ☆86 Updated last month
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆108 Updated 3 months ago
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation ☆24 Updated 9 months ago
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs ☆24 Updated 5 months ago
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆49 Updated 3 months ago
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation ☆35 Updated last year
- Distribution-Aware Prompt Tuning for Vision-Language Models (ICCV 2023) ☆37 Updated 11 months ago
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆26 Updated 5 months ago
- ☆22 Updated 2 weeks ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆37 Updated last year
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆84 Updated 2 years ago
- [ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation ☆35 Updated last month
- Official Pytorch codebase for Open-Vocabulary Instance Segmentation without Manual Mask Annotations [CVPR 2023] ☆47 Updated last year
- ☆29 Updated last year
- OVAD: Open-vocabulary Attribute Detection code ☆28 Updated last year
- Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision ☆24 Updated last month
- Scene Graph Generate Zero Shot ☆18 Updated last year
- [ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval" ☆50 Updated 4 months ago
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter". ☆16 Updated last year
- ☆57 Updated last year
- ☆52 Updated last year
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection". ☆31 Updated 2 months ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆22 Updated 5 months ago
- [CVPR2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detect… ☆45 Updated last month
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset ☆52 Updated 3 months ago
- The official implementation for Candidate Set Re-ranking for Composed Image Retrieval (TMLR) 01/2024 ☆13 Updated 9 months ago
- (ICCV 2023) MasQCLIP for Open-Vocabulary Universal Image Segmentation ☆34 Updated last year
- Official implementation of TCL (CVPR 2023) ☆109 Updated last year