[CVPR2026] Detect Anything via Next Point Prediction
☆1,464Feb 22, 2026Updated 4 months ago
Alternatives and similar repositories for Rex-Omni
Users that are interested in Rex-Omni are comparing it to the libraries listed below.
- [ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy☆2,683Oct 15, 2025Updated 8 months ago
- OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion☆406Mar 12, 2025Updated last year
- YOLOE: Real-Time Seeing Anything [ICCV 2025]☆2,180Jun 26, 2025Updated last year
- VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs☆322Jun 18, 2026Updated 2 weeks ago
- [CVPR 2024] Real-Time Open-Vocabulary Object Detection☆6,435Feb 26, 2025Updated last year
- [ICLR-2026] Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning☆150Jun 30, 2025Updated last year
- PyTorch implementation of "Efficiently Reconstructing Dynamic Scenes One 🎯 D4RT at a Time"☆69Jun 15, 2026Updated 2 weeks ago
- Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"☆632Jan 17, 2026Updated 5 months ago
- [AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning☆134Dec 3, 2025Updated 6 months ago
- [ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark☆184Oct 15, 2025Updated 8 months ago
- DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding☆1,393Jul 23, 2025Updated 11 months ago
- Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'☆2,250Oct 29, 2025Updated 8 months ago
- Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and …☆17,651Sep 5, 2024Updated last year
- [CVPR2024 Highlight]GLEE: General Object Foundation Model for Images and Videos at Scale☆1,172Oct 21, 2024Updated last year
- Reference PyTorch implementation and models for DINOv3☆10,753Jun 15, 2026Updated 2 weeks ago
- Solve Visual Understanding with Reinforced VLMs☆5,991Mar 12, 2026Updated 3 months ago
- [DEIMv2] Real Time Object Detection Meets DINOv3☆1,891Mar 24, 2026Updated 3 months ago
- Official repository for "AM-RADIO: Reduce All Domains Into One"☆1,877May 29, 2026Updated last month
- [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"☆10,337Aug 12, 2024Updated last year
- [NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…☆277Nov 5, 2025Updated 7 months ago
- [CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥…☆5,327Jun 15, 2026Updated 2 weeks ago
- Official code for "No time to train! Training-Free Reference-Based Instance Segmentation"☆311Apr 14, 2026Updated 2 months ago
- State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!☆2,306Apr 13, 2026Updated 2 months ago
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning"☆540Apr 8, 2024Updated 2 years ago
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything☆1,376May 1, 2025Updated last year
- UMatcher: A modern template matching model☆86May 31, 2025Updated last year
- D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]☆3,180Apr 6, 2026Updated 2 months ago
- Open-source and strong foundation image recognition models.☆3,679Feb 18, 2025Updated last year
- Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.☆19,499Jan 30, 2026Updated 5 months ago
- [WACV 2026] Official implementation of the paper: “CountingDINO: A Training-free Pipeline for Exemplar-based Class-Agnostic Counting”☆62Jun 22, 2026Updated last week
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat…☆1,580Jun 14, 2025Updated last year
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.☆10,074Sep 22, 2025Updated 9 months ago
- Fully Open Framework for Democratized Multimodal Training☆1,115Updated this week
- Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting.☆325Jun 25, 2025Updated last year
- Efficient Track Anything☆812Jan 6, 2025Updated last year
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design.☆2,005Nov 7, 2025Updated 7 months ago
- [CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".☆181Dec 13, 2024Updated last year
- (CVPR 2026) Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval"☆230Jun 7, 2026Updated 3 weeks ago
- Effortless data labeling with AI support from Segment Anything and other awesome models.☆9,589Updated this week