IDEA-Research / Rex-Omni
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
☆1,128 · Jan 25, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for Rex-Omni
Users interested in Rex-Omni are comparing it to the repositories listed below.
- [ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy ☆2,630 · Oct 15, 2025 · Updated 3 months ago
- YOLOE: Real-Time Seeing Anything [ICCV 2025] ☆2,029 · Jun 26, 2025 · Updated 7 months ago
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning ☆142 · Jun 30, 2025 · Updated 7 months ago
- OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion ☆398 · Mar 12, 2025 · Updated 11 months ago
- VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs ☆237 · Nov 28, 2025 · Updated 2 months ago
- Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement" ☆597 · Jan 17, 2026 · Updated 3 weeks ago
- [ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark ☆177 · Oct 15, 2025 · Updated 3 months ago
- [CVPR 2024] Real-Time Open-Vocabulary Object Detection ☆6,208 · Feb 26, 2025 · Updated 11 months ago
- Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning' ☆2,316 · Oct 29, 2025 · Updated 3 months ago
- DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding ☆1,334 · Jul 23, 2025 · Updated 6 months ago
- Solve Visual Understanding with Reinforced VLMs ☆5,833 · Oct 21, 2025 · Updated 3 months ago
- Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding ☆210 · Oct 15, 2025 · Updated 3 months ago
- ☆35 · Sep 29, 2025 · Updated 4 months ago
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning" ☆529 · Apr 8, 2024 · Updated last year
- [CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥… ☆4,841 · Dec 3, 2025 · Updated 2 months ago
- Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and … ☆17,397 · Sep 5, 2024 · Updated last year
- Reference PyTorch implementation and models for DINOv3 ☆9,525 · Nov 20, 2025 · Updated 2 months ago
- ☆10 · Feb 14, 2025 · Updated last year
- Official repository for "AM-RADIO: Reduce All Domains Into One" ☆1,634 · Updated this week
- [DEIMv2] Real Time Object Detection Meets DINOv3 ☆1,478 · Jan 7, 2026 · Updated last month
- [CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale ☆1,169 · Oct 21, 2024 · Updated last year
- Fully Open Framework for Democratized Multimodal Training ☆718 · Dec 27, 2025 · Updated last month
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o (an open-source multimodal dialogue model approaching GPT-4o performance) ☆9,792 · Sep 22, 2025 · Updated 4 months ago
- [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection" ☆9,694 · Aug 12, 2024 · Updated last year
- UMatcher: A modern template matching model ☆78 · May 31, 2025 · Updated 8 months ago
- Efficient Track Anything ☆776 · Jan 6, 2025 · Updated last year
- D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight] ☆3,008 · Jan 5, 2026 · Updated last month
- Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆18,273 · Jan 30, 2026 · Updated 2 weeks ago
- [CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence ☆1,428 · Sep 26, 2025 · Updated 4 months ago
- New generation of CLIP with fine grained discrimination capability, ICML2025 ☆545 · Oct 27, 2025 · Updated 3 months ago
- [ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning ☆1,448 · Jun 26, 2025 · Updated 7 months ago
- ☆80 · Jan 18, 2026 · Updated 3 weeks ago
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat… ☆1,544 · Jun 14, 2025 · Updated 8 months ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT ☆123 · Jan 30, 2026 · Updated 2 weeks ago
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything ☆1,362 · May 1, 2025 · Updated 9 months ago
- Open-source and strong foundation image recognition models. ☆3,589 · Feb 18, 2025 · Updated 11 months ago
- [AAAI 2026] Official implementation of the paper "SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D F… ☆24 · Jan 8, 2026 · Updated last month
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design. ☆1,985 · Nov 7, 2025 · Updated 3 months ago
- A Toolkit to Help Optimize Large Onnx Model ☆164 · Oct 26, 2025 · Updated 3 months ago