PQ3D/PQ3D

Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries"

☆85

Alternatives and similar repositories for PQ3D

Users who are interested in PQ3D are comparing it to the repositories listed below.

ZCMax / ScanReason
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
☆85 • Oct 10, 2024 • Updated last year
sg-3d / sg3d
☆55 • Oct 3, 2024 • Updated last year
scene-verse / SceneVerse
Official implementation of ECCV24 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding"
☆288 • Mar 19, 2025 • Updated last year
KuanchihHuang / Reason3D
[3DV 2025] Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model
☆124 • May 30, 2025 • Updated last year
InternRobotics / Grounded_3D-LLM
Code&Data for Grounded 3D-LLM with Referent Tokens
☆136 • Jan 5, 2025 • Updated last year
3d-vista / 3D-VisTA
Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"
☆215 • Sep 7, 2023 • Updated 2 years ago
ZzZZCHS / Chat-Scene
[NeurIPS 2024 & TPAMI 2026] Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
☆216 • Apr 12, 2026 • Updated 3 months ago
YunzeMan / Situation3D
[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning
☆44 • Dec 9, 2024 • Updated last year
dk-liang / UniSeg3D
[NeurIPS 2024] A Unified Framework for 3D Scene Understanding
☆179 • Jul 7, 2025 • Updated last year
embodied-generalist / embodied-generalist
[ICML 2024] LEO: An Embodied Generalist Agent in 3D World
☆485 • Apr 20, 2025 • Updated last year
MSR3D / MSR3D
[NeurIPS 2024] MSR3D: Multimodal Situated Reasoning in 3D Scenes
☆75 • Dec 2, 2025 • Updated 7 months ago
Open3DA / LL3DA
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Langu…
☆319 • Jul 17, 2024 • Updated 2 years ago
facebookresearch / univlg
Unifying 2D and 3D Vision-Language Understanding
☆126 • Jul 2, 2026 • Updated 3 weeks ago
InternRobotics / EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
☆672 • Jun 13, 2025 • Updated last year
SceneFun3D / scenefun3d
SceneFun3D ToolKit
☆180 • Apr 17, 2025 • Updated last year
beacon-3d / Beacon3D
[CVPR 2025] Beacon3D: Object-centric Evaluation for 3D Grounding-QA
☆28 • Nov 25, 2025 • Updated 7 months ago
lslrh / DMA
Official code of DMA: Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding, ECCV 2024
☆32 • Jul 18, 2024 • Updated 2 years ago
ZCMax / LLaVA-3D
[ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
☆385 • Oct 21, 2025 • Updated 9 months ago
sosppxo / RG-SAN
[NeurIPS 2024 Oral] RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
☆20 • Dec 22, 2024 • Updated last year
heshuting555 / RefMask3D
[ACM MM-2024] RefMask3D: Language-Guided Transformer for 3D Referring Segmentation
☆65 • Jul 29, 2024 • Updated last year
chunfeng3364 / LARC
☆19 • Jun 26, 2024 • Updated 2 years ago
Visual-AI / 3DRS
[NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
☆158 • Dec 9, 2025 • Updated 7 months ago
CurryYuan / PhraseRefer
[TNNLS] Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases
☆17 • Jul 10, 2025 • Updated last year
VinAIResearch / Open3DIS
Open3DIS: Open-vocabulary 3D Instance Segmentation with 2D Mask Guidance (CVPR 2024)
☆135 • Nov 12, 2024 • Updated last year
boschresearch / Open3DSG
[CVPR 2024] Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships
☆166 • Sep 16, 2024 • Updated last year
SilongYong / SQA3D
[ICLR 2023] SQA3D for embodied scene understanding and reasoning
☆168 • Oct 13, 2023 • Updated 2 years ago
MTU3D / MTU3D
☆266 • Aug 6, 2025 • Updated 11 months ago
OpenM3D / M3DBench
[ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts.
☆61 • Oct 1, 2024 • Updated last year
LaVi-Lab / Video-3D-LLM
[CVPR 2025] The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".
☆218 • Jun 4, 2025 • Updated last year
SceneCOT / scenecot
[ICLR 2026] SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes
☆27 • Mar 22, 2026 • Updated 4 months ago
PKU-EPIC / MaskClustering
[CVPR 24] MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation
☆129 • Apr 25, 2024 • Updated 2 years ago
3dlg-hcvc / multi3drefer
[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects
☆98 • Mar 26, 2026 • Updated 3 months ago
nickgkan / butd_detr
Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds"
☆95 • Jun 9, 2023 • Updated 3 years ago
phermosilla / msm
Official repository of the paper: Masked Scene Modeling (CVPR 2025)
☆18 • Dec 13, 2025 • Updated 7 months ago
YunzeMan / Lexicon3D
[NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
☆102 • Feb 2, 2025 • Updated last year
ayushjain1144 / odin
Code for the paper: "ODIN: A Single Model for 2D and 3D Segmentation" (CVPR 2024)
☆177 • Feb 27, 2026 • Updated 4 months ago
PhyScene / PhyScene
Code implementation of CVPR 2024 highlight paper "PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI"
☆209 • Jun 9, 2025 • Updated last year
baaivision / Uni3D
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
☆677 • Jan 12, 2026 • Updated 6 months ago
LogosRoboticsGroup / SPAR
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…
☆90 • Jan 5, 2026 • Updated 6 months ago