mcahny / rovit
RO-ViT CVPR 2023 "Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers"
☆16Updated last year
Related projects:
- [CVPR'24 Highlight] SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection☆38Updated last month
- [CVPR24] Official Implementation of GEM (Grounding Everything Module)☆72Updated 9 months ago
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs☆22Updated 3 months ago
- Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity"☆54Updated last year
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"☆23Updated 3 months ago
- [CVPR2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detect…☆39Updated last month
- Official Pytorch codebase for Open-Vocabulary Instance Segmentation without Manual Mask Annotations [CVPR 2023]☆47Updated 9 months ago
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)☆98Updated last month
- [CBMI2024] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".☆17Updated 2 months ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆22Updated 3 months ago
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models☆35Updated last month
- [CVPR 2024] Improving language-visual pretraining efficiency by performing cluster-based masking on images.☆20Updated 4 months ago
- A large-scale benchmark for the evaluation of embeddings across a number of fine-grained and instance-level visual domains.☆13Updated 3 months ago
- Compress conventional Vision-Language Pre-training data☆49Updated 11 months ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea…☆96Updated 4 months ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning☆36Updated last year
- Code release for "Language-conditioned Detection Transformer"☆82Updated 3 months ago
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation☆25Updated 7 months ago
- Simple PyTorch implementation of "Libra: Building Decoupled Vision System on Large Language Models" (accepted by ICML 2024)☆41Updated 3 months ago
- Code for Label Propagation for Zero-shot Classification with Vision-Language Models (CVPR2024)☆31Updated last month
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)☆21Updated 11 months ago
- IFSeg: Image-free Semantic Segmentation via Vision-Language Model (CVPR 2023)☆80Updated last year
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection".☆22Updated last week
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation☆35Updated last year
- Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling @ CVPR22☆42Updated last year