yangchris11 / samurai
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
☆7,030 · Updated 9 months ago
Alternatives and similar repositories for samurai
Users interested in samurai are comparing it to the repositories listed below.
- Depth Pro: Sharp Monocular Metric Depth in Less Than a Second. ☆5,135 · Updated 8 months ago
- CoTracker is a model for tracking any point (pixel) on a video. ☆4,730 · Updated 11 months ago
- [NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation ☆7,293 · Updated 11 months ago
- High-resolution models for human tasks. ☆5,255 · Updated last year
- YOLOE: Real-Time Seeing Anything [ICCV 2025] ☆1,967 · Updated 6 months ago
- RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO and designed for fine-tun… ☆4,996 · Updated last month
- [CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation ☆7,931 · Updated last year
- Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2 ☆3,147 · Updated last month
- [CVPR 2024] Real-Time Open-Vocabulary Object Detection ☆6,117 · Updated 10 months ago
- Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders (CVPR 2025, Highlight) ☆807 · Updated 8 months ago
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… (see the usage sketch after this list) ☆18,185 · Updated last year
- A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms ☆2,197 · Updated last week
- The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading t… ☆6,486 · Updated last week
- Images to inference with no labeling (use foundation models to train supervised models). ☆2,553 · Updated 7 months ago
- YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024] ☆11,156 · Updated 9 months ago
- [NeurIPS 2025] SpatialLM: Training Large Language Models for Structured Indoor Modeling ☆4,150 · Updated 3 months ago
- [CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation ☆1,435 · Updated last month
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything ☆1,360 · Updated 7 months ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,706 · Updated last month
- computer vision and sports ☆4,804 · Updated last month
- streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,649 · Updated last week
- DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding ☆1,313 · Updated 5 months ago
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆4,847 · Updated last week
- [CVPR 2025] Official PyTorch implementation of "EdgeTAM: On-Device Track Anything Model" ☆845 · Updated 3 weeks ago
- [CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation ☆3,035 · Updated 3 weeks ago
- [CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation ☆3,036 · Updated 2 weeks ago
- [ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy ☆2,609 · Updated 2 months ago
- [CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos ☆1,648 · Updated 2 months ago
- [CVPR 2025] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors ☆2,722 · Updated last month
- Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling" ☆1,563 · Updated 6 months ago
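Several entries above build on the SAM 2 video predictor, and SAMURAI itself adapts it for zero-shot visual tracking. For orientation, here is a minimal sketch of SAM 2's video-prediction workflow as described in its repository; the checkpoint and config paths, the frame directory, and the box prompt are placeholder assumptions to replace for your own setup, and SAMURAI's own entry point may differ.

```python
# Minimal SAM 2 video-tracking sketch. Paths, config name, and the box prompt
# are placeholder assumptions; adjust them to your local checkpoint and video frames.
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

checkpoint = "./checkpoints/sam2.1_hiera_large.pt"  # assumed local checkpoint path
model_cfg = "configs/sam2.1/sam2.1_hiera_l.yaml"    # assumed config shipped with SAM 2

predictor = build_sam2_video_predictor(model_cfg, checkpoint)

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    # Build per-video inference state from a directory of JPEG frames.
    state = predictor.init_state(video_path="./video_frames")

    # Prompt the first frame with a bounding box for object id 1 (coordinates are illustrative).
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(
        state, frame_idx=0, obj_id=1, box=np.array([100, 100, 300, 300], dtype=np.float32)
    )

    # Propagate the prompt through the rest of the video to get per-frame masks.
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        pass  # e.g., save or visualize `masks` for each tracked object
```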