ATR-DBI/ScanQA

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/ATR-DBI/ScanQA)

ATR-DBI / ScanQA

☆158

Alternatives and similar repositories for ScanQA

Users that are interested in ScanQA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

SilongYong / SQA3D
View on GitHub
[ICLR 2023] SQA3D for embodied scene understanding and reasoning
☆159Oct 13, 2023Updated 2 years ago
heng-hw / SpaCap3D
View on GitHub
[IJCAI 2022] Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds (official pytorch implementation)
☆21Aug 31, 2022Updated 3 years ago
zlccccc / 3DVL_Codebase
View on GitHub
[CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
☆57Jan 29, 2023Updated 3 years ago
nickgkan / butd_detr
View on GitHub
Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds"
☆94Jun 9, 2023Updated 2 years ago
daveredrum / D3Net
View on GitHub
[ECCV2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
☆44Aug 27, 2022Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
3dlg-hcvc / multi3drefer
View on GitHub
[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects
☆96Mar 26, 2026Updated last month
cshizhe / vil3dref
View on GitHub
Official implementation of Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (NeurIPS'22).
☆67Dec 2, 2022Updated 3 years ago
3d-vista / 3D-VisTA
View on GitHub
Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"
☆215Sep 7, 2023Updated 2 years ago
referit3d / referit3d
View on GitHub
Code accompanying our ECCV-2020 paper on 3D Neural Listeners.
☆140Jun 29, 2021Updated 4 years ago
CurryYuan / PhraseRefer
View on GitHub
[TNNLS] Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases
☆16Jul 10, 2025Updated 10 months ago
daveredrum / Scan2Cap
View on GitHub
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
☆106Sep 6, 2022Updated 3 years ago
matthewdm0816 / BridgeQA
View on GitHub
[AAAI 24] Official Codebase for BridgeQA: Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
☆27Jul 12, 2024Updated last year
hanhung / TGNN
View on GitHub
☆25Mar 15, 2022Updated 4 years ago
sega-hsj / MVT-3DVG
View on GitHub
[CVPR 2022] Multi-View Transformer for 3D Visual Grounding
☆81Nov 9, 2022Updated 3 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
daveredrum / ScanRefer
View on GitHub
[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
☆301Feb 10, 2023Updated 3 years ago
yanx27 / CLEVR3D
View on GitHub
CLEVR3D Dataset: Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation
☆20Feb 2, 2024Updated 2 years ago
Chat-3D / Chat-3D
View on GitHub
Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes"
☆57Mar 28, 2024Updated 2 years ago
jianghaojun / Awesome-3D-Vision-and-Language
View on GitHub
A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets.
☆101Feb 26, 2023Updated 3 years ago
evelinehong / 3D-CLR-Official
View on GitHub
[CVPR 2023] Code for "3D Concept Learning and Reasoning from Multi-View Images"
☆85Jan 20, 2024Updated 2 years ago
leolyj / 3D-VLP
View on GitHub
This is the code related to "Context-aware Alignment and Mutual Masking for 3D-Language Pre-training" (CVPR 2023).
☆29Jun 15, 2023Updated 2 years ago
UMass-Embodied-AGI / 3D-LLM
View on GitHub
Code for 3D-LLM: Injecting the 3D World into Large Language Models
☆1,195Jun 6, 2024Updated last year
ZzZZCHS / Chat-Scene
View on GitHub
[NeurIPS 2024 & TPAMI 2026] Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
☆211Apr 12, 2026Updated last month
PNXD / FFL-3DOG
View on GitHub
Free-form Description-guided 3D Visual Graph Networks for Object Grounding in Point Cloud
☆17Jun 23, 2022Updated 3 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
Asterisci / Language-Assisted-3D
View on GitHub
[AAAI 2023 Oral] Language-Assisted 3D Feature Learning for Semantic Scene Understanding
☆12Aug 1, 2023Updated 2 years ago
fjhzhixi / 3D-SPS
View on GitHub
☆64May 17, 2023Updated 3 years ago
ATR-DBI / CityRefer
View on GitHub
☆47Jan 17, 2024Updated 2 years ago
InternRobotics / EmbodiedScan
View on GitHub
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
☆665Jun 13, 2025Updated 11 months ago
scene-verse / SceneVerse
View on GitHub
Official implementation of ECCV24 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding"
☆282Mar 19, 2025Updated last year
zlccccc / 3DVG-Transformer
View on GitHub
[ICCV2021] 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds
☆43Jul 6, 2022Updated 3 years ago
CurryYuan / InstanceRefer
View on GitHub
[ICCV 2021] InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextua…
☆74Mar 22, 2025Updated last year
CurryYuan / X-Trans2Cap
View on GitHub
[CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning
☆36Aug 26, 2022Updated 3 years ago
OpenM3D / M3DBench
View on GitHub
[ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts.
☆61Oct 1, 2024Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
ActiveVisionLab / Awesome-LLM-3D
View on GitHub
Awesome-LLM-3D: a curated list of Multi-modal Large Language Model in 3D world Resources
☆2,199Apr 16, 2026Updated last month
Open3DA / LL3DA
View on GitHub
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Langu…
☆317Jul 17, 2024Updated last year
ZCMax / ScanReason
View on GitHub
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
☆83Oct 10, 2024Updated last year
InternRobotics / Grounded_3D-LLM
View on GitHub
Code&Data for Grounded 3D-LLM with Referent Tokens
☆134Jan 5, 2025Updated last year
Leon1207 / 3DRefTR
View on GitHub
This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"
☆26Aug 24, 2023Updated 2 years ago
ZCMax / LLaVA-3D
View on GitHub
[ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
☆378Oct 21, 2025Updated 7 months ago
zyang-ur / SAT
View on GitHub
SAT: 2D Semantics Assisted Training for 3D Visual Grounding, ICCV 2021 (Oral)
☆32Sep 29, 2021Updated 4 years ago