mk-minchul / sapiensid
☆19 · Updated last month
Alternatives and similar repositories for sapiensid
Users who are interested in sapiensid are comparing it to the libraries listed below.
- [NeurIPS 2023] HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception ☆44 · Updated last year
- Official implementation of Faceptor: A Generalist Model for Face Perception. ☆47 · Updated last year
- [ECCV 2024 Oral] PetFace: A Large-Scale Dataset and Benchmark for Animal Identification https://arxiv.org/abs/2407.13555 ☆87 · Updated 5 months ago
- [NeurIPS2024] Cross-video Identity Correlating for Person Re-identification Pre-training ☆98 · Updated 6 months ago
- ENTIRe-ID ☆25 · Updated last year
- [ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark ☆177 · Updated 2 months ago
- The official PyTorch implementation of Logical Consistency and Greater Descriptive Power for Facial Hair Attribute Learning - CVPR 2023 ☆12 · Updated last year
- [ACM MM2025] The official repository for the RealSyn dataset ☆39 · Updated 3 weeks ago
- Scaling Vision Pre-Training to 4K Resolution ☆217 · Updated last week
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning ☆131 · Updated 6 months ago
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception ☆148 · Updated this week
- Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval" ☆86 · Updated 3 weeks ago
- [ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets ☆60 · Updated 5 months ago
- [ACCV 2024 Poster] Official code for "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model" ☆10 · Updated last year
- (ICCV 2025) ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations ☆126 · Updated last month
- CAPE using text-graphs ☆28 · Updated 9 months ago
- InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR2024 ☆89 · Updated last year
- Official Implementation of "VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning". ☆59 · Updated last month
- [ICCV 2023] The official PyTorch code for Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation ☆90 · Updated 2 years ago
- Official code for CAVIS: Context-Aware Video Instance Segmentation ☆94 · Updated 3 months ago
- (NeurIPS2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection ☆123 · Updated last year
- PersonViT: Large-scale Self-supervised Vision Transformer for Person Re-Identification ☆40 · Updated 7 months ago
- [ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting ☆122 · Updated last year
- Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence" ☆127 · Updated 3 weeks ago
- Recognize Any Regions ☆122 · Updated last year