[CVPR2026] Detect Anything via Next Point Prediction
☆1,428Feb 22, 2026Updated 3 months ago
Alternatives and similar repositories for Rex-Omni
Users that are interested in Rex-Omni are comparing it to the libraries listed below.
- [ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy☆2,677Oct 15, 2025Updated 7 months ago
- OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion☆404Mar 12, 2025Updated last year
- YOLOE: Real-Time Seeing Anything [ICCV 2025]☆2,161Jun 26, 2025Updated 11 months ago
- VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs☆310Updated this week
- [CVPR 2024] Real-Time Open-Vocabulary Object Detection☆6,394Feb 26, 2025Updated last year
- [ICLR-2026] Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning☆149Jun 30, 2025Updated 11 months ago
- PyTorch implementation of "Efficiently Reconstructing Dynamic Scenes One 🎯 D4RT at a Time"☆62Jan 27, 2026Updated 4 months ago
- Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"☆630Jan 17, 2026Updated 4 months ago
- [AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning☆131Dec 3, 2025Updated 6 months ago
- [ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark☆183Oct 15, 2025Updated 7 months ago
- Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'☆2,246Oct 29, 2025Updated 7 months ago
- Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and …☆17,627Sep 5, 2024Updated last year
- [CVPR2024 Highlight]GLEE: General Object Foundation Model for Images and Videos at Scale☆1,172Oct 21, 2024Updated last year
- Reference PyTorch implementation and models for DINOv3☆10,598Jun 3, 2026Updated last week
- Solve Visual Understanding with Reinforced VLMs☆5,966Mar 12, 2026Updated 2 months ago
- [DEIMv2] Real Time Object Detection Meets DINOv3☆1,819Mar 24, 2026Updated 2 months ago
- Official repository for "AM-RADIO: Reduce All Domains Into One"☆1,857May 29, 2026Updated last week
- [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"☆10,222Aug 12, 2024Updated last year
- [NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…☆273Nov 5, 2025Updated 7 months ago
- [CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥…☆5,267May 20, 2026Updated 3 weeks ago
- State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!☆2,295Apr 13, 2026Updated last month
- Official code for "No time to train! Training-Free Reference-Based Instance Segmentation"☆308Apr 14, 2026Updated last month
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning"☆539Apr 8, 2024Updated 2 years ago
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything☆1,377May 1, 2025Updated last year
- UMatcher: A modern template matching model☆82May 31, 2025Updated last year
- D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]☆3,162Apr 6, 2026Updated 2 months ago
- Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.☆19,304Jan 30, 2026Updated 4 months ago
- Open-source and strong foundation image recognition models.☆3,658Feb 18, 2025Updated last year
- [WACV 2026] Official implementation of the paper: “CountingDINO: A Training-free Pipeline for Exemplar-based Class-Agnostic Counting”☆59Mar 8, 2026Updated 3 months ago
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat…☆1,577Jun 14, 2025Updated 11 months ago
- ☆10Feb 14, 2025Updated last year
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o☆10,055Sep 22, 2025Updated 8 months ago
- Fully Open Framework for Democratized Multimodal Training☆1,057Updated this week
- Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting.☆321Jun 25, 2025Updated 11 months ago
- Efficient Track Anything☆803Jan 6, 2025Updated last year
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design.☆2,003Nov 7, 2025Updated 7 months ago
- [CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".☆181Dec 13, 2024Updated last year
- (CVPR 2026) Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval"☆213Updated this week
- Effortless data labeling with AI support from Segment Anything and other awesome models.☆9,313May 24, 2026Updated 2 weeks ago