mk-minchul / sapiensid
☆15 Updated 2 weeks ago
Alternatives and similar repositories for sapiensid
Users who are interested in sapiensid are comparing it to the libraries listed below.
- Official implementation of Faceptor: A Generalist Model for Face Perception. ☆47 Updated last year
- [NeurIPS 2023] HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception ☆43 Updated last year
- [ACM MM2025] The official repository for the RealSyn dataset ☆38 Updated 5 months ago
- [ECCV 2024 Oral] PetFace: A Large-Scale Dataset and Benchmark for Animal Identification https://arxiv.org/abs/2407.13555 ☆80 Updated 4 months ago
- ☆22 Updated last year
- MLCD-Seg is a zero-shot segmentation model from DeepGlint. ☆17 Updated 5 months ago
- The official PyTorch implementation of Logical Consistency and Greater Descriptive Power for Facial Hair Attribute Learning - CVPR 2023 ☆12 Updated last year
- [ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark ☆172 Updated last month
- ☆33 Updated 3 weeks ago
- [CVPR2025] Official implementation of High Fidelity Scene Text Synthesis. ☆78 Updated 8 months ago
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning ☆127 Updated 5 months ago
- InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR2024 ☆86 Updated last year
- [NeurIPS2024] Cross-video Identity Correlating for Person Re-identification Pre-training ☆95 Updated 5 months ago
- Official repo for 【FaceScore: Benchmarking and Enhancing Face Quality in Human Generation】 ☆80 Updated 11 months ago
- Scaling Vision Pre-Training to 4K Resolution ☆217 Updated 3 months ago
- ☆195 Updated 6 months ago
- Code for CLIB-FIQA: Face Image Quality Assessment with Confidence Calibration ☆37 Updated last month
- [ACCV 2024 Poster] official code for "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model" ☆10 Updated last year
- This is the official implementation of "Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors", which is accepted at… ☆80 Updated 6 months ago
- Official repository of the paper: "ID-Booth: Identity-consistent Face Generation with Diffusion Models" ☆37 Updated last month
- CAPE using text-graphs ☆28 Updated 8 months ago
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆28 Updated last year
- ☆52 Updated 2 years ago
- ☆24 Updated last year
- A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team. ☆73 Updated last year
- [ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets ☆55 Updated 4 months ago
- Code For Our Work: DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries [ECCV-2024] ☆14 Updated last year
- Official PyTorch implementation of UniHCP ☆157 Updated 2 years ago
- [ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition" ☆82 Updated 9 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆94 Updated 9 months ago