facebookresearch / sapiens
High-resolution models for human tasks.
☆5,038 Updated 6 months ago
Alternatives and similar repositories for sapiens
Users who are interested in sapiens are comparing it to the libraries listed below.
- Depth Pro: Sharp Monocular Metric Depth in Less Than a Second. ☆4,519 Updated last month
- [CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation ☆7,566 Updated 10 months ago
- CoTracker is a model for tracking any point (pixel) on a video. ☆4,328 Updated 4 months ago
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆15,733 Updated 5 months ago
- [NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation ☆5,635 Updated 4 months ago
- Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory" ☆6,829 Updated 2 months ago
- InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models ☆3,889 Updated 5 months ago
- [CVPR 2024] Real-Time Open-Vocabulary Object Detection ☆5,549 Updated 3 months ago
- TripoSR: Fast 3D Object Reconstruction from a Single Image ☆5,434 Updated 9 months ago
- [CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation ☆2,787 Updated 3 weeks ago
- [NeurIPS 2024] Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image ☆3,421 Updated 5 months ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,295 Updated this week
- HunyuanVideo: A Systematic Framework For Large Video Generation Model ☆10,233 Updated this week
- Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2 ☆2,231 Updated last week
- [ICLR 2024] Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior ☆2,934 Updated last month
- 4M: Massively Multimodal Masked Modeling ☆1,727 Updated this week
- DUSt3R: Geometric 3D Vision Made Easy ☆6,311 Updated 2 weeks ago
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆4,225 Updated 2 weeks ago
- Official Code for Stable Cascade ☆6,595 Updated 10 months ago
- tiny vision language model ☆8,040 Updated last week
- Tracking Any Point (TAP) ☆1,537 Updated 2 weeks ago
- Grounding Image Matching in 3D with MASt3R ☆2,218 Updated last week
- PyTorch code and models for V-JEPA self-supervised learning from video. ☆3,013 Updated 3 months ago
- New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos ☆7,999 Updated last month
- The repo for "Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator" ☆589 Updated last month
- [ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy ☆2,511 Updated last month
- ☆3,529 Updated this week
- Official repository for our work on micro-budget training of large-scale diffusion models. ☆1,420 Updated 4 months ago
- Stable Virtual Camera: Generative View Synthesis with Diffusion Models ☆1,300 Updated this week
- [ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling ☆2,956 Updated 5 months ago