ZhanYang-nwpu/Mono3DVG

[AAAI 2024] Mono3DVG: 3D Visual Grounding in Monocular Images, AAAI, 2024

☆72

Alternatives and similar repositories for Mono3DVG

Users who are interested in Mono3DVG are comparing it to the repositories listed below.

GuanRunwei / Talk2Radar
☆46 · Dec 29, 2025 · Updated 6 months ago
qzp2018 / MCLN
This is a PyTorch implementation of MCLN proposed by our paper "Multi-branch Collaborative Learning Network for 3D Visual Grounding" (ECCV…
☆27 · Oct 10, 2024 · Updated last year
liudaizong / Awesome-3D-Visual-Grounding
😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources.
☆282 · Jan 14, 2026 · Updated 6 months ago
ZCMax / ScanReason
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
☆85 · Oct 10, 2024 · Updated last year
ZhanYang-nwpu / PE-RSITR
Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval, 2023
☆29 · Jan 14, 2024 · Updated 2 years ago
ZhanYang-nwpu / Awesome-Multimodal-Large-Language-Models-for-UAV-Vision-Language-Perception
UAV-MLLMs
☆29 · Apr 7, 2026 · Updated 3 months ago
jxbbb / TOD3Cap
[ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
☆132 · Mar 1, 2025 · Updated last year
ZzZZCHS / WS-3DVG
[ICCV 2023] Distilling Coarse-to-fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding
☆14 · Oct 2, 2024 · Updated last year
lingli1996 / GLOBE
[NeurIPS 2025] Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models
☆18 · Apr 1, 2026 · Updated 3 months ago
CVRP-SOLE / SOLE
[ICLR 2025] Official code of "Segment any 3D Object with Language"
☆73 · Apr 14, 2026 · Updated 3 months ago
Pointcept / OpenIns3D
[ECCV'24] OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
☆207 · Oct 19, 2024 · Updated last year
referit3d / referit3d
Code accompanying our ECCV-2020 paper on 3D Neural Listeners.
☆141 · Jun 29, 2021 · Updated 5 years ago
Haochen-Wang409 / ross3d
[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
☆70 · Jul 22, 2025 · Updated 11 months ago
vobecant / POP3D
Source code for NeurIPS paper "POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images"
☆122 · Jan 7, 2025 · Updated last year
AronCao49 / Latte
[ECCV 2024] Reliable Spatial-Temporal Voxels for Multi-Modal Test-Time Adaptation
☆18 · Jan 12, 2026 · Updated 6 months ago
zlccccc / 3DVL_Codebase
[CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
☆57 · Jan 29, 2023 · Updated 3 years ago
PKU-EPIC / MaskClustering
[CVPR 24] MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation
☆129 · Apr 25, 2024 · Updated 2 years ago
wzzheng / OccSora
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
☆204 · May 31, 2024 · Updated 2 years ago
nazmul-karim170 / Free-Editor
[ECCV'24] PyTorch Implementation of "Free-Editor: Zero-shot Text-driven 3D Scene Editing"
☆20 · Dec 5, 2024 · Updated last year
leolyj / 3D-VLP
This is the code related to "Context-aware Alignment and Mutual Masking for 3D-Language Pre-training" (CVPR 2023).
☆29 · Jun 15, 2023 · Updated 3 years ago
NorthSummer / SliceOcc
☆29 · Jan 27, 2025 · Updated last year
3dlg-hcvc / multi3drefer
[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects
☆98 · Mar 26, 2026 · Updated 3 months ago
songw-zju / Scribble2Scene
The official implementation of "Label-efficient Semantic Scene Completion with Scribble Annotations" (IJCAI 2024)
☆14 · Jul 27, 2024 · Updated last year
heshuting555 / SegPoint
☆38 · Jul 19, 2024 · Updated 2 years ago
yanmin-wu / EDA
[CVPR 2023] EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
☆134 · Oct 11, 2023 · Updated 2 years ago
uvavision / SelfEQ
[CVPR 2024] Code for "Improved Visual Grounding through Self-Consistent Explanations".
☆28 · Mar 1, 2024 · Updated 2 years ago
sega-hsj / MVT-3DVG
[CVPR 2022] Multi-View Transformer for 3D Visual Grounding
☆81 · Nov 9, 2022 · Updated 3 years ago
YunzeMan / Situation3D
[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning
☆44 · Dec 9, 2024 · Updated last year
LogosRoboticsGroup / SPAR
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…
☆90 · Jan 5, 2026 · Updated 6 months ago
UVA-Computer-Vision-Lab / ovmono3d
[3DV 2026] Open Vocabulary Monocular 3D Object Detection
☆98 · Apr 29, 2026 · Updated 2 months ago
fengyi233 / ViPOcc
☆38 · Feb 16, 2025 · Updated last year
CurryYuan / ZSVG3D
[CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
☆63 · Aug 3, 2024 · Updated last year
OpenSpaceAI / DEST3D
PyTorch implementation for our ICLR 2025 paper "State Space Model Meets Transformer: A New Paradigm for 3D Object Detection"
☆43 · Mar 27, 2025 · Updated last year
NickHezhuolin / OS-Det3D
☆17 · Jun 29, 2026 · Updated 3 weeks ago
pqh22 / ProxyTransformation
[CVPR2025] ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding
☆50 · Sep 2, 2025 · Updated 10 months ago
ZrrSkywalker / MonoDETR
[ICCV 2023] The first DETR model for monocular 3D object detection with depth-guided transformer
☆444 · Jul 15, 2025 · Updated last year
ZhengJun-AI / vfid-metrics
A toolkit for computing Video Fréchet Inception Distance (VFID) metrics.
☆11 · May 28, 2024 · Updated 2 years ago
LaVi-Lab / Video-3D-LLM
[CVPR 2025] The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".
☆218 · Jun 4, 2025 · Updated last year
iris0329 / SeeGround
[CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
☆222 · Apr 21, 2025 · Updated last year