ZhanYang-nwpu / Mono3DVG
[AAAI 2024] Mono3DVG: 3D Visual Grounding in Monocular Images, AAAI, 2024
☆66 · Apr 9, 2024 · Updated last year
Alternatives and similar repositories for Mono3DVG
Users who are interested in Mono3DVG are comparing it to the libraries listed below.
- ☆42 · Dec 29, 2025 · Updated last month
- Up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources. ☆261 · Jan 14, 2026 · Updated last month
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities ☆80 · Oct 10, 2024 · Updated last year
- [ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes ☆129 · Mar 1, 2025 · Updated 11 months ago
- [ACM MM 2025] EmbodiedOcc++: Boosting Embodied 3D Occupancy Prediction with Plane Regularization and Uncertainty Sampler ☆26 · Aug 7, 2025 · Updated 6 months ago
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio… ☆77 · Jan 5, 2026 · Updated last month
- [ECCV'24] OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation ☆205 · Oct 19, 2024 · Updated last year
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆43 · Dec 9, 2024 · Updated last year
- ☆27 · Jan 27, 2025 · Updated last year
- [TNNLS 2023] This is official implementation of "PlaneSeg: Building a Plug-in for Boosting Planar Region Segmentation" ☆24 · Aug 27, 2023 · Updated 2 years ago
- [IROS2025] Adjacent-view Transformers for Supervised Surround-view Depth Estimation ☆14 · Nov 14, 2025 · Updated 3 months ago
- Code accompanying our ECCV-2020 paper on 3D Neural Listeners. ☆138 · Jun 29, 2021 · Updated 4 years ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆129 · Jul 24, 2025 · Updated 6 months ago
- [CVPR 2024] Code for "Improved Visual Grounding through Self-Consistent Explanations". ☆27 · Mar 1, 2024 · Updated last year
- [ICLR 2025] Official code of "Segment any 3D Object with Language" ☆67 · Oct 11, 2025 · Updated 4 months ago
- Fine tune LLaVA 1.5 - based on article by wandb ☆13 · Feb 19, 2024 · Updated last year
- [CVPR 24] MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation ☆123 · Apr 25, 2024 · Updated last year
- [ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness ☆64 · Jul 22, 2025 · Updated 6 months ago
- Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries" ☆84 · Aug 2, 2024 · Updated last year
- [CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI ☆651 · Jun 13, 2025 · Updated 8 months ago
- [CVPR 2022] Multi-View Transformer for 3D Visual Grounding ☆80 · Nov 9, 2022 · Updated 3 years ago
- [EMNLP'23] Code for 'Rethinking Negative Pairs in Code Search' ☆14 · Oct 17, 2023 · Updated 2 years ago
- [ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects ☆94 · Oct 18, 2025 · Updated 3 months ago
- [CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds ☆57 · Jan 29, 2023 · Updated 3 years ago
- [CVPR 2023] EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding ☆132 · Oct 11, 2023 · Updated 2 years ago
- [ECCV24] Navigation Instruction Generation with BEV Perception and Large Language Models ☆30 · Jul 16, 2024 · Updated last year
- This is the code related to "Context-aware Alignment and Mutual Masking for 3D-Language Pre-training" (CVPR 2023). ☆29 · Jun 15, 2023 · Updated 2 years ago
- [CVPR 2024] MonoCD: Monocular 3D Object Detection with Complementary Depths ☆55 · Jan 29, 2026 · Updated 2 weeks ago
- This is a tool for automatic image labeling using natural language ☆14 · Nov 27, 2023 · Updated 2 years ago
- [ECCV 2024] Reliable Spatial-Temporal Voxels for Multi-Modal Test-Time Adaptation ☆16 · Jan 12, 2026 · Updated last month
- Code for "Distilling Coarse-to-fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding" (ICCV 2023) ☆14 · Oct 2, 2024 · Updated last year
- Structure From Motion in 50 lines using OpenCV ☆12 · May 31, 2021 · Updated 4 years ago
- [TNNLS] Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases ☆16 · Jul 10, 2025 · Updated 7 months ago
- OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving ☆196 · May 31, 2024 · Updated last year
- [CVPR2024] Open-Vocabulary Semantic Segmentation with Image Embedding Balancing ☆40 · Jan 12, 2026 · Updated last month
- [CVPR2025] ProxyTransformation : Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding ☆48 · Sep 2, 2025 · Updated 5 months ago
- ☆17 · Jul 18, 2024 · Updated last year
- (ICCV2023) Official implementation of 'ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance'… ☆59 · Apr 18, 2024 · Updated last year
- ☆37 · Feb 16, 2025 · Updated 11 months ago