xushilin1/dst-det

xushilin1 / dst-det

[TCSVT] state-of-the-art open vocabulary detector on COCO/LVIS/V3Det

☆35

Alternatives and similar repositories for dst-det

Users who are interested in dst-det are comparing it to the repositories listed below.

VinAIResearch / LP-OVOD
LP-OVOD: Open-Vocabulary Object Detection by Linear Probing (WACV 2024)
☆30 · Jul 23, 2024 · Updated 2 years ago
SkyworkAI / DAQ-VS
Code For Our Work: DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries [ECCV-2024]
☆15 · Jul 11, 2024 · Updated 2 years ago
marinero4972 / CyberV
☆20 · Jun 10, 2025 · Updated last year
wusize / CLIPSelf
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
☆207 · Feb 5, 2024 · Updated 2 years ago
mala-lab / SIC-CADS
Code Implementation of "Simple Image-level Classification Improves Open-vocabulary Object Detection" (AAAI'24)
☆30 · Jan 12, 2024 · Updated 2 years ago
Shi-qingyu / DreamRelation
[CVPR 2025] DreamRelation: Bridging Customization and Relation Generation
☆19 · Dec 17, 2025 · Updated 7 months ago
jianzongwu / Does-Hearing-Help-Seeing
☆19 · Dec 3, 2025 · Updated 7 months ago
lxtGH / Panoptic-PartFormer
[ECCV-2022] The First Unified End-to-End System for Panoptic Part Segmentation
☆63 · Sep 2, 2024 · Updated last year
FoundationVision / GenerateU
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
☆196 · Mar 29, 2025 · Updated last year
jianzongwu / betrayed-by-captions
(ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
☆48 · Jul 18, 2024 · Updated 2 years ago
xushilin1 / RMP-SAM
[ICLR 2025 oral] RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
☆271 · Apr 11, 2025 · Updated last year
Shi-qingyu / RecTok
[CVPR 26] Official PyTorch Implementation of RecTok
☆23 · Feb 24, 2026 · Updated 5 months ago
lxtGH / Video-K-Net
[CVPR-2022 (oral)]-Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
☆157 · Aug 19, 2023 · Updated 2 years ago
Haochen-Wang409 / TreeVGR
[ICLR'26] Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
☆91 · Jan 26, 2026 · Updated 6 months ago
CVMI-Lab / CoDet
(NeurIPS2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection
☆123 · Apr 26, 2024 · Updated 2 years ago
xiaofeng94 / SAS-Det
Taming Self-Training for Open-Vocabulary Object Detection, CVPR 2024
☆22 · Dec 30, 2023 · Updated 2 years ago
wusize / ovdet
[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection
☆187 · Oct 25, 2023 · Updated 2 years ago
jianzongwu / MotionBooth
[NeurIPS 2024 Spotlight] The official implement of research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"
☆138 · Oct 8, 2024 · Updated last year
jianzongwu / robust-ref-seg
(TIP 2024) Towards Robust Referring Image Segmentation
☆40 · Mar 2, 2024 · Updated 2 years ago
HarborYuan / PolyphonicFormer
[ECCV 2022] 🎵PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation
☆57 · Dec 22, 2022 · Updated 3 years ago
clovaai / ProxyDet
Official implementation of the paper "ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection…
☆26 · Feb 13, 2024 · Updated 2 years ago
wang-chaoyang / SemFlow
[NeurIPS 2024] SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow
☆46 · Dec 1, 2024 · Updated last year
wusize / Harmon
[ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
☆192 · May 21, 2025 · Updated last year
ChocoWu / SeTok
Codes for ICLR 2025 Paper: Towards Semantic Equivalence of Tokenization in Multimodal LLM
☆81 · Apr 19, 2025 · Updated last year
alirezazareian / ovr-cnn
A new framework for open-vocabulary object detection, based on maskrcnn-benchmark
☆249 · Feb 11, 2023 · Updated 3 years ago
LiChenyang-Github / LongShortNet
LongShortNet for Streaming Perception task.
☆13 · Aug 27, 2023 · Updated 2 years ago
fanglaosi / Point-In-Context
[NeurIPS2023] Implementation of the paper: Explore In-Context Learning for 3D Point Cloud Understanding
☆74 · Mar 18, 2026 · Updated 4 months ago
marinero4972 / VideoZeroBench
Official implementation of "VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification"
☆21 · May 7, 2026 · Updated 2 months ago
ucas-vg / Sambor
Sambor: Boosting Segment Anything Model Towards Open-Vocabulary Learning
☆32 · Dec 7, 2023 · Updated 2 years ago
PhoenixZ810 / MG-LLaVA
Official repository for paper MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning (https://arxiv.org/abs/2406.17770).
☆160 · Sep 27, 2024 · Updated last year
jianzongwu / Awesome-Open-Vocabulary
(TPAMI 2024) A Survey on Open Vocabulary Learning
☆999 · May 12, 2026 · Updated 2 months ago
SAIC-Vision / WS-3D-Lane
[ICRA 2023] WS-3D-Lane: Weakly Supervised 3D Lane Detection with 2D Lane Labels
☆16 · Apr 27, 2023 · Updated 3 years ago
zhang-tao-whu / DVIS_Plus
☆141 · Jul 4, 2024 · Updated 2 years ago
Xuan-World / Mamba-YOLO-World
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection
☆104 · Mar 12, 2025 · Updated last year
seermer / RTGen
☆13 · Jul 30, 2024 · Updated last year
Haochen-Wang409 / Grasp-Any-Region
[ICLR'26] Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
☆99 · Jan 26, 2026 · Updated 6 months ago
Ephemeral182 / Empirical-Study-of-GPT-4o-Image-Gen
An Empirical Study of GPT-4o Image Generation Capabilities
☆29 · Apr 16, 2025 · Updated last year
M-E-AGI-Lab / Muddit
[ICLR 2026] Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusio…
☆119 · Apr 13, 2026 · Updated 3 months ago
chen-xin-94 / DART
☆23 · Jul 9, 2026 · Updated 2 weeks ago