TACJu / Compositor
This repo contains the code for our paper "Compositor: Bottom-Up Clustering and Compositing for Robust Part and Object Segmentation".
☆15Updated 4 months ago
Related projects
Alternatives and complementary repositories for Compositor
- [NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation☆72Updated 4 months ago
- [ECCV2024] PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects☆26Updated last month
- [IJCAI 2022] Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds (official pytorch implementation)☆20Updated 2 years ago
- Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number]☆22Updated last week
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23)☆37Updated 10 months ago
- [ICCV 2023] HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation☆29Updated 9 months ago
- This is the code related to "Context-aware Alignment and Mutual Masking for 3D-Language Pre-training" (CVPR 2023).☆25Updated last year
- Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling @ CVPR22☆42Updated 2 years ago
- [CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds☆53Updated last year
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"☆27Updated 6 months ago
- Can 3D Vision-Language Models Truly Understand Natural Language?☆21Updated 7 months ago
- A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets.☆95Updated last year
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks☆59Updated last month
- This is the official implementation for our paper "LAR: Look Around and Refer".☆27Updated last year
- SAT: 2D Semantics Assisted Training for 3D Visual Grounding, ICCV 2021 (Oral)☆31Updated 3 years ago
- [ICCV2021] 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds☆39Updated 2 years ago
- Large-Vocabulary Video Instance Segmentation dataset☆76Updated 4 months ago
- This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long …☆61Updated 5 months ago
- Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations ICCV23☆26Updated 3 weeks ago
- Official implementation of PARIS3D (Accepted to ECCV 2024).☆18Updated last month
- Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds"☆80Updated last year
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models☆41Updated 3 months ago
- Multi-View Transformer for 3D Visual Grounding [CVPR 2022]☆66Updated 2 years ago
- Official Code for the NeurIPS'23 paper "3D-Aware Visual Question Answering about Parts, Poses and Occlusions"☆14Updated 3 weeks ago