[CVPR2026] Detect Anything via Next Point Prediction
☆1,280Feb 22, 2026Updated last month
Alternatives and similar repositories for Rex-Omni
Users that are interested in Rex-Omni are comparing it to the libraries listed below.
- [ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy☆2,647Oct 15, 2025Updated 5 months ago
- OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion☆399Mar 12, 2025Updated last year
- VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs☆286Mar 12, 2026Updated 3 weeks ago
- YOLOE: Real-Time Seeing Anything [ICCV 2025]☆2,099Jun 26, 2025Updated 9 months ago
- PyTorch implementation of "Efficiently Reconstructing Dynamic Scenes One 🎯 D4RT at a Time"☆52Jan 27, 2026Updated 2 months ago
- [ICLR-2026] Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning☆146Jun 30, 2025Updated 9 months ago
- [AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning☆114Dec 3, 2025Updated 4 months ago
- [CVPR 2024] Real-Time Open-Vocabulary Object Detection☆6,290Feb 26, 2025Updated last year
- Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"☆620Jan 17, 2026Updated 2 months ago
- [ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark☆180Oct 15, 2025Updated 5 months ago
- DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding☆1,352Jul 23, 2025Updated 8 months ago
- Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'☆2,287Oct 29, 2025Updated 5 months ago
- Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and …☆17,499Sep 5, 2024Updated last year
- Reference PyTorch implementation and models for DINOv3☆10,057Mar 30, 2026Updated last week
- Solve Visual Understanding with Reinforced VLMs☆5,935Mar 12, 2026Updated 3 weeks ago
- Official repository for "AM-RADIO: Reduce All Domains Into One"☆1,745Mar 30, 2026Updated last week
- State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!☆2,242Mar 12, 2026Updated 3 weeks ago
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning"☆531Apr 8, 2024Updated 2 years ago
- [DEIMv2] Real Time Object Detection Meets DINOv3☆1,651Mar 24, 2026Updated 2 weeks ago
- [CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥…☆5,036Mar 2, 2026Updated last month
- [CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale☆1,173Oct 21, 2024Updated last year
- [NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…☆271Nov 5, 2025Updated 5 months ago
- [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"☆9,978Aug 12, 2024Updated last year
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything☆1,365May 1, 2025Updated 11 months ago
- Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.☆18,917Jan 30, 2026Updated 2 months ago
- Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting.☆310Jun 25, 2025Updated 9 months ago
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance☆9,949Sep 22, 2025Updated 6 months ago
- (CVPR 2026) Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval"☆168Feb 21, 2026Updated last month
- ☆113Jan 18, 2026Updated 2 months ago
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design.☆1,995Nov 7, 2025Updated 5 months ago
- [CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".☆181Dec 13, 2024Updated last year
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat…☆1,567Jun 14, 2025Updated 9 months ago
- PyTorch implementation of "EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation"☆117Apr 2, 2026Updated last week
- Open-source and strong foundation image recognition models.☆3,614Feb 18, 2025Updated last year
- Official code for "No time to train! Training-Free Reference-Based Instance Segmentation"☆298Mar 20, 2026Updated 3 weeks ago
- Fully Open Framework for Democratized Multimodal Training☆788Dec 27, 2025Updated 3 months ago
- Efficient Track Anything☆794Jan 6, 2025Updated last year
- [NeurIPS 2025] Official code implementation of Perception R1: Pioneering Perception Policy with Reinforcement Learning☆290Jul 15, 2025Updated 8 months ago
- [ICRA 2025] Official repository for "UASTHN: Uncertainty-Aware Deep Homography Estimation for UAV Satellite-Thermal Geo-localization"☆22Feb 28, 2026Updated last month