MaureenZOU / detectron2-xyz
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
☆16 Updated 2 years ago
Related projects:
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆96 Updated 4 months ago
- Stay tuned! ☆11 Updated 5 months ago
- ☆32 Updated 8 months ago
- Simple script to parallelize download and extract files for SA-1B Dataset. ☆24 Updated last year
- ☆17 Updated 5 months ago
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models ☆36 Updated last year
- Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆107 Updated last month
- DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution ☆34 Updated 2 months ago
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆58 Updated 2 weeks ago
- Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want ☆52 Updated 5 months ago
- ☆93 Updated 3 months ago
- [NeurIPS2022] This is the official implementation of the paper "Expediting Large-Scale Vision Transformer for Dense Prediction without Fi… ☆81 Updated 10 months ago
- ☆27 Updated 5 months ago
- [ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs ☆45 Updated last month
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection". ☆22 Updated last week
- ☆52 Updated last year
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs ☆72 Updated 3 months ago
- Official repository of paper "Subobject-level Image Tokenization" ☆58 Updated 4 months ago
- Detectron2 Toolbox and Benchmark for V3Det ☆15 Updated 3 months ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model ☆36 Updated last month
- ☆100 Updated last month
- 【ECCV2024】The official repo of Griffon series ☆93 Updated 2 months ago
- Official Pytorch codebase for Open-Vocabulary Instance Segmentation without Manual Mask Annotations [CVPR 2023] ☆47 Updated 9 months ago
- Visual Prompt Augmentation ☆25 Updated 8 months ago
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM ☆47 Updated 2 months ago
- [ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation ☆45 Updated 2 weeks ago
- [ECCV 2024] ControlCap: Controllable Region-level Captioning ☆49 Updated last month
- ☆58 Updated this week
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs ☆22 Updated 3 months ago