yuhangzang/ContextDET

Contextual Object Detection with Multimodal Large Language Models

☆261

Alternatives and similar repositories for ContextDET

Users who are interested in ContextDET are comparing it to the repositories listed below.

jianzongwu / betrayed-by-captions
(ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
☆48 · Jul 18, 2024 · Updated last year
jshilong / GPT4RoI
(ECCVW 2025) GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
☆556 · Jun 3, 2025 · Updated last year
OpenGVLab / VisionLLM
VisionLLM Series
☆1,149 · Feb 27, 2025 · Updated last year
microsoft / GLIP
Grounded Language-Image Pre-training
☆2,605 · Jan 24, 2024 · Updated 2 years ago
wusize / ovdet
[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection
☆187 · Oct 25, 2023 · Updated 2 years ago
shikras / shikra
☆813 · Jul 8, 2024 · Updated 2 years ago
OptimalScale / DetGPT
☆785 · Aug 7, 2024 · Updated last year
OpenGVLab / all-seeing
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of …
☆508 · Aug 9, 2024 · Updated last year
mightyzau / RegionBLIP
☆59 · Aug 7, 2023 · Updated 2 years ago
weivision / Correlational-Image-Modeling
☆31 · Dec 9, 2023 · Updated 2 years ago
BAAI-DCAI / Visual-Instruction-Tuning
SVIT: Scaling up Visual Instruction Tuning
☆167 · Jun 20, 2024 · Updated 2 years ago
yuhangzang / OV-DETR
[Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022)
☆240 · Aug 3, 2022 · Updated 3 years ago
PVIT-official / PVIT
Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models
☆37 · Sep 19, 2023 · Updated 2 years ago
amazon-science / prompt-pretraining
Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition"
☆259 · May 3, 2024 · Updated 2 years ago
Luodian / GenBench
Benchmarking and Analyzing Generative Data for Visual Recognition
☆26 · Jul 25, 2023 · Updated 2 years ago
JialianW / GRiT
GRiT: A Generative Region-to-text Transformer for Object Understanding (ECCV2024)
☆341 · Jan 8, 2024 · Updated 2 years ago
clin1223 / VLDet
[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)
☆192 · Mar 22, 2024 · Updated 2 years ago
JIA-Lab-research / LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
☆2,657 · Feb 16, 2025 · Updated last year
chongzhou96 / MaskCLIP
Official PyTorch implementation of "Extract Free Dense Labels from CLIP" (ECCV 22 Oral)
☆479 · Sep 19, 2022 · Updated 3 years ago
lancopku / clip-openness
[ACL 2023] Delving into the Openness of CLIP
☆24 · Jan 11, 2023 · Updated 3 years ago
ucas-vg / Sambor
Sambor: Boosting Segment Anything Model Towards Open-Vocabulary Learning
☆32 · Dec 7, 2023 · Updated 2 years ago
xiaofeng94 / VL-PLM
Exploiting unlabeled data with vision and language models for object detection, ECCV 2022
☆97 · Jan 16, 2024 · Updated 2 years ago
penghao-wu / vstar
PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"
☆705 · Jan 7, 2024 · Updated 2 years ago
microsoft / RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
☆816 · Mar 20, 2024 · Updated 2 years ago
FoundationVision / GenerateU
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
☆196 · Mar 29, 2025 · Updated last year
ZhangYuanhan-AI / visual_prompt_retrieval
[NeurIPS2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"
☆182 · Mar 4, 2024 · Updated 2 years ago
FocalNet / FocalNet-DINO
This repo contains the code and configuration files for reproducing object detection results of FocalNets with DINO
☆68 · Mar 10, 2023 · Updated 3 years ago
hjbahng / visual_prompting
Exploring Visual Prompts for Adapting Large-Scale Models
☆291 · Jun 6, 2022 · Updated 4 years ago
YukunLi99 / AdaptSAM
☆22 · Jun 30, 2023 · Updated 3 years ago
NVlabs / ODISE
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
☆947 · Jul 6, 2024 · Updated 2 years ago
bytedance / DQ-Det
Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation
☆38 · Sep 12, 2023 · Updated 2 years ago
MendelXu / zsseg.baseline
Open-vocabulary Semantic Segmentation
☆185 · Mar 28, 2023 · Updated 3 years ago
wusize / OpenUni
☆188 · Jun 27, 2025 · Updated last year
tgxs002 / CORA
A DETR-style framework for open-vocabulary detection (OVD). CVPR 2023
☆201 · Apr 16, 2023 · Updated 3 years ago
mbzuai-oryx / groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha…
☆963 · Aug 5, 2025 · Updated 11 months ago
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆298 · Mar 13, 2024 · Updated 2 years ago
ImKeTT / ZeroGen
[NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation
☆14 · Oct 7, 2023 · Updated 2 years ago
sunsmarterjie / ChatterBox
[AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues
☆61 · May 2, 2025 · Updated last year
prannaykaul / mm-ovod
Official repo for our ICML 23 paper: "Multi-Modal Classifiers for Open-Vocabulary Object Detection"
☆95 · Jun 22, 2023 · Updated 3 years ago