yangchris11 / samurai
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
☆6,512 · Updated this week
Alternatives and similar repositories for samurai:
Users interested in samurai are comparing it to the libraries listed below.
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆14,142 · Updated last month
- High-resolution models for human tasks. ☆4,831 · Updated 3 months ago
- Depth Pro: Sharp Monocular Metric Depth in Less Than a Second. ☆4,122 · Updated 4 months ago
- Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accele… ☆7,540 · Updated last week
- Official repository for LTX-Video ☆2,857 · Updated this week
- [NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation ☆4,646 · Updated 3 weeks ago
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆3,413 · Updated last week
- Code of Pyramidal Flow Matching for Efficient Video Generative Modeling ☆2,784 · Updated 2 months ago
- Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders ☆493 · Updated last month
- CoTracker is a model for tracking any point (pixel) on a video. ☆4,136 · Updated last month
- 🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents. ☆11,162 · Updated this week
- [CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation ☆7,297 · Updated 7 months ago
- 🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning ☆9,169 · Updated this week
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything ☆1,203 · Updated 3 months ago
- Official inference repo for FLUX.1 models ☆20,331 · Updated 2 weeks ago
- Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2 ☆1,688 · Updated 2 months ago
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,506 · Updated this week
- Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your resea… ☆3,549 · Updated 3 weeks ago
- Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,373 · Updated this week
- HunyuanVideo: A Systematic Framework For Large Video Generation Model ☆8,621 · Updated this week
- [CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation ☆1,869 · Updated last week
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆2,916 · Updated last week
- The best OSS video generation models ☆2,915 · Updated last month
- A generative world for general-purpose robotics & embodied AI learning. ☆23,918 · Updated this week
- DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 S… ☆1,731 · Updated 2 months ago
- [SIGGRAPH Asia 2024, Journal Track] ToonCrafter: Generative Cartoon Interpolation ☆5,636 · Updated 5 months ago