TeleeMa / SADE
An Examination of the Compositionality of Large Generative Vision-Language Models
☆19 Updated 11 months ago
Alternatives and similar repositories for SADE:
Users who are interested in SADE are comparing it to the repositories listed below:
- Official PyTorch Implementation of Learning Affordance Grounding from Exocentric Images, CVPR 2022 ☆54 Updated 4 months ago
- [NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation ☆79 Updated 8 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks ☆59 Updated 5 months ago
- A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets. ☆97 Updated 2 years ago
- OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding ☆18 Updated this week
- Affordance Grounding from Demonstration Video to Target Image (CVPR 2023) ☆43 Updated 7 months ago
- LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding (CVPR 2023) ☆36 Updated last year
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities ☆65 Updated 5 months ago
- [CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds ☆53 Updated 2 years ago
- Official implementation of Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (NeurIPS'22). ☆59 Updated 2 years ago
- Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries" ☆73 Updated 7 months ago
- Official Implementation of Frequency-enhanced Data Augmentation for Vision-and-Language Navigation (NeurIPS2023) ☆14 Updated last year
- [ICCV2021] 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds ☆42 Updated 2 years ago
- This repository is the official implementation of Improving Object-centric Learning With Query Optimization ☆50 Updated last year
- [ECCV 2024] OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models ☆42 Updated 2 months ago
- [NeurIPS 2024] Official code for paper "EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection" ☆27 Updated 3 months ago
- SAT: 2D Semantics Assisted Training for 3D Visual Grounding, ICCV 2021 (Oral) ☆32 Updated 3 years ago
- [NeurIPS 2024] Official code repository for MSR3D paper ☆44 Updated 2 weeks ago
- Code for the paper "Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundatio… ☆28 Updated last year
- Can 3D Vision-Language Models Truly Understand Natural Language? ☆21 Updated 11 months ago
- CVPR 2024 "Instance Tracking in 3D Scenes from Egocentric Videos" ☆18 Updated 8 months ago
- Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024) ☆32 Updated 6 months ago
- [ICLR 2023] SQA3D for embodied scene understanding and reasoning ☆127 Updated last year
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training" ☆33 Updated 10 months ago
- This is the code related to "Context-aware Alignment and Mutual Masking for 3D-Language Pre-training" (CVPR 2023). ☆27 Updated last year