hkchengrex / Grounded-Segment-Anything
Grounded-SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
☆39 Updated last year
Alternatives and similar repositories for Grounded-Segment-Anything:
Users who are interested in Grounded-Segment-Anything are comparing it to the libraries listed below:
- Code for the paper: "ODIN: A Single Model for 2D and 3D Segmentation" (CVPR 2024) ☆136 Updated 2 months ago
- Code for the paper "pix2gestalt: Amodal Segmentation by Synthesizing Wholes" (CVPR 2024) ☆153 Updated 8 months ago
- Grounded Tracking for Streaming Videos ☆76 Updated 3 months ago
- [ICCV2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation ☆363 Updated last year
- ☆152 Updated 7 months ago
- Muggled SAM: Segmentation without the magic ☆96 Updated last week
- [ICCV 2023 R6D] PyTorch implementation of CNOS: A Strong Baseline for CAD-based Novel Object Segmentation based on Segmenting Anything an… ☆222 Updated 3 weeks ago
- ☆221 Updated 7 months ago
- ☆61 Updated 6 months ago
- CAVIS: Context-Aware Video Instance Segmentation ☆72 Updated last month
- ☆214 Updated last month
- ☆101 Updated 6 months ago
- Combining OwlViT with Segment Anything - Open-vocabulary Detection and Segmentation (Text-conditioned, and Image-conditioned) ☆158 Updated last year
- [NeurIPS 2023] HASSOD: Hierarchical Adaptive Self-Supervised Object Detection ☆54 Updated 11 months ago
- HANDAL Dataset and Pipeline ☆74 Updated 7 months ago
- Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment" ☆198 Updated last year
- Official PyTorch implementation of Self-Supervised Any-Point Tracking by Contrastive Random Walks, ECCV 2024. ☆46 Updated 2 months ago
- ☆165 Updated 4 months ago
- Benchmarking Panoptic Video Scene Graph Generation (PVSG), CVPR'23 ☆81 Updated 9 months ago
- [NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation ☆76 Updated 7 months ago
- [NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models" ☆100 Updated last month
- [ICCV2023] EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding ☆75 Updated last year
- ☆95 Updated last year
- [ECCV 2024] Improving 2D Feature Representations by 3D-Aware Fine-Tuning ☆257 Updated 2 months ago
- [ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training ☆122 Updated 8 months ago
- Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries" ☆61 Updated 5 months ago
- Official Implementation for "Matching Is Not Enough: A Two-Stage Framework for Category-Agnostic Pose Estimation", CVPR 2023. ☆49 Updated last year
- Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models. ☆107 Updated 5 months ago
- [NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding ☆69 Updated last month
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning ☆202 Updated 3 months ago