alibaba / AICITY2024_Track2_AliOpenTrek_CityLLaVA
☆33 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for AICITY2024_Track2_AliOpenTrek_CityLLaVA
- A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios ☆47 · Updated 5 months ago
- DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution ☆39 · Updated last week
- [NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks ☆31 · Updated this week
- Code for the ICML 2023 paper "Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation" ☆35 · Updated last year
- Open-vocabulary Semantic Segmentation ☆34 · Updated 9 months ago
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM ☆36 · Updated 5 months ago
- ✨✨ MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? ☆78 · Updated last week
- ☆29 · Updated 3 months ago
- Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning ☆66 · Updated 5 months ago
- [CVPR2023] Official implementation of the paper "DETRs with Hybrid Matching" ☆27 · Updated 2 years ago
- ☆45 · Updated 2 weeks ago
- Official repository for NuScenes-MQA, accepted to the LLVM-AD Workshop at WACV 2024 ☆24 · Updated 11 months ago
- Code for the paper "Towards Open-Ended Visual Recognition with Large Language Model" ☆90 · Updated 4 months ago
- Making LLaVA Tiny via MoE-Knowledge Distillation ☆63 · Updated 3 weeks ago
- ☆19 · Updated 11 months ago
- Open-Vocabulary Panoptic Segmentation ☆18 · Updated 2 months ago
- ☆78 · Updated 9 months ago
- ☆16 · Updated 2 years ago
- Official implementation of the CVPR paper "Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent …" ☆25 · Updated last year
- A collection of visual instruction tuning datasets ☆75 · Updated 8 months ago
- A curated list of papers on Video LLMs ☆19 · Updated 4 months ago
- Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023) ☆54 · Updated 7 months ago
- ☆19 · Updated 6 months ago
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs ☆77 · Updated 5 months ago
- Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595) ☆68 · Updated 2 weeks ago
- ☆15 · Updated 5 months ago
- Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want ☆61 · Updated last month
- Code for "Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking" ☆20 · Updated last month
- [ECCV 2024] Official code for "Dolphins: Multimodal Language Model for Driving" ☆47 · Updated 4 months ago
- ☆57 · Updated last year