OptimalScale/DetGPT

☆785

Alternatives and similar repositories for DetGPT

Users who are interested in DetGPT are comparing it to the repositories listed below.

pipilurj / ROBOT
☆27 · Apr 11, 2023 · Updated 3 years ago
microsoft / GLIP
Grounded Language-Image Pre-training
☆2,604 · Jan 24, 2024 · Updated 2 years ago
jshilong / GPT4RoI
(ECCVW 2025) GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
☆556 · Jun 3, 2025 · Updated last year
OpenGVLab / VisionLLM
VisionLLM Series
☆1,151 · Feb 27, 2025 · Updated last year
IDEA-Research / GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
☆10,394 · Aug 12, 2024 · Updated last year
shikras / shikra
☆814 · Jul 8, 2024 · Updated 2 years ago
IDEA-Research / OpenSeeD
[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"
☆762 · Jan 22, 2024 · Updated 2 years ago
microsoft / X-Decoder
[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language
☆1,346 · Oct 5, 2023 · Updated 2 years ago
UX-Decoder / Segment-Everything-Everywhere-All-At-Once
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
☆4,794 · Aug 19, 2024 · Updated last year
JIA-Lab-research / LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
☆2,660 · Feb 16, 2025 · Updated last year
X-PLUG / mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
☆2,536 · Apr 2, 2025 · Updated last year
IDEA-Research / Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and …
☆17,666 · Sep 5, 2024 · Updated last year
open-mmlab / Multimodal-GPT
Multimodal-GPT
☆1,512 · Jun 4, 2023 · Updated 3 years ago
salesforce / LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
☆11,250 · Jun 2, 2026 · Updated last month
OpenGVLab / all-seeing
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of …
☆507 · Aug 9, 2024 · Updated last year
pipilurj / MLLM-protector
The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance"
☆46 · Apr 21, 2024 · Updated 2 years ago
pipilurj / DynaFed
☆50 · Apr 1, 2023 · Updated 3 years ago
OpenGVLab / Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
☆3,343 · Jan 18, 2025 · Updated last year
haotian-liu / LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,923 · Aug 12, 2024 · Updated last year
yuhangzang / ContextDET
Contextual Object Detection with Multimodal Large Language Models
☆261 · Oct 14, 2024 · Updated last year
facebookresearch / ImageBind
ImageBind: One Embedding Space to Bind Them All
☆9,056 · Nov 21, 2025 · Updated 7 months ago
OpenGVLab / InternGPT
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBin…
☆3,202 · Aug 20, 2024 · Updated last year
baaivision / Painter
Painter & SegGPT Series: Vision Foundation Models from BAAI
☆2,593 · Dec 6, 2024 · Updated last year
xinyu1205 / recognize-anything
Open-source and strong foundation image recognition models.
☆3,688 · Feb 18, 2025 · Updated last year
microsoft / RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
☆817 · Mar 20, 2024 · Updated 2 years ago
W-Ted / UDC-NeRF
Official code for ICCV2023 paper: Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis
☆34 · Dec 27, 2023 · Updated 2 years ago
UX-Decoder / Semantic-SAM
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
☆2,848 · Jul 10, 2025 · Updated last year
baaivision / Emu
Emu Series: Generative Multimodal Models from BAAI
☆1,776 · Jan 12, 2026 · Updated 6 months ago
baaivision / EVA
EVA Series: Visual Representation Fantasies from BAAI
☆2,684 · Aug 1, 2024 · Updated last year
AILab-CVC / GPT4Tools
GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the u…
☆771 · Dec 19, 2023 · Updated 2 years ago
yxuansu / PandaGPT
[TLLM'23] PandaGPT: One Model To Instruction-Follow Them All
☆862 · Jun 1, 2023 · Updated 3 years ago
Meituan-AutoML / Lenna
☆87 · Feb 5, 2024 · Updated 2 years ago
Saiyan-World / grounded-segment-any-parts
Grounded Segment Anything: From Objects to Parts
☆416 · May 19, 2023 · Updated 3 years ago
fundamentalvision / Uni-Perceiver
☆291 · Aug 14, 2025 · Updated 11 months ago
Vision-CAIR / MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
☆25,662 · Sep 2, 2024 · Updated last year
luogen1996 / LaVIN
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
☆523 · Jan 27, 2024 · Updated 2 years ago
UCSB-AI / MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
☆867 · May 8, 2025 · Updated last year
IDEA-Research / MaskDINO
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segme…
☆1,541 · Dec 20, 2023 · Updated 2 years ago
facebookresearch / Detic
Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".
☆2,007 · Mar 21, 2024 · Updated 2 years ago