VincentDENGP / 3D-LR
Can 3D Vision-Language Models Truly Understand Natural Language?
☆21Updated last year
Alternatives and similar repositories for 3D-LR
Users who are interested in 3D-LR are comparing it to the repositories listed below
- [NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation☆85Updated last year
- Egocentric Video Understanding Dataset (EVUD)☆29Updated 11 months ago
- ☆36Updated 2 years ago
- ☆58Updated last year
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"☆36Updated last year
- [CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…☆43Updated last week
- Official implementation of "Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness".☆29Updated 2 weeks ago
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities☆74Updated 8 months ago
- VisualGPTScore for visio-linguistic reasoning☆27Updated last year
- [CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds☆55Updated 2 years ago
- (ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation☆47Updated 11 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks☆57Updated 9 months ago
- The official implementation of JM3D.☆30Updated 2 months ago
- ☆13Updated 2 months ago
- ☆11Updated 8 months ago
- Source code for the Paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models"☆12Updated 2 weeks ago
- ☆37Updated last year
- (NeurIPS 2024) What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights☆27Updated 7 months ago
- [CVPR 2023] RILS: Masked Visual Reconstruction in Language Semantic Space (https://arxiv.org/abs/2301.06958)☆44Updated last year
- ☆49Updated 8 months ago
- Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples☆23Updated 6 months ago
- [ICLR'25] Reconstructive Visual Instruction Tuning☆92Updated 2 months ago
- ☆25Updated 2 months ago
- [ECCV 2024] OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models☆47Updated 5 months ago
- [NeurIPS 2024] Official code repository for MSR3D paper☆60Updated last week
- Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling @ CVPR22☆42Updated 2 years ago
- ☆71Updated 6 months ago
- OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models☆40Updated 3 weeks ago
- ☆25Updated last year
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation…☆37Updated 4 months ago