Wayne-Mai / EgoLoc

For Ego4D VQ3D Task

☆22

Alternatives and similar repositories for EgoLoc

Users who are interested in EgoLoc are comparing it to the repositories listed below.

ZCMax / ScanReason
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
☆80 Updated last year
MSR3D / MSR3D
[NeurIPS 2024] Official code repository for MSR3D paper
☆69 Updated last month
IT3DEgo / IT3DEgo
CVPR 2024 "Instance Tracking in 3D Scenes from Egocentric Videos"
☆19 Updated last year
ATR-DBI / ScanQA
☆148 Updated 2 years ago
mahtabbigverdi / Aurora-perception
☆45 Updated 4 months ago
PQ3D / PQ3D
Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries"
☆83 Updated last year
SilongYong / SQA3D
[ICLR 2023] SQA3D for embodied scene understanding and reasoning
☆154 Updated 2 years ago
VincentDENGP / 3D-LR
Can 3D Vision-Language Models Truly Understand Natural Language?
☆20 Updated last year
jianghaojun / Awesome-3D-Vision-and-Language
A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets.
☆101 Updated 2 years ago
franciszzj / OpenPSG
[ECCV 2024] OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
☆49 Updated last year
facebookresearch / VidOSC
Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024)
☆35 Updated last year
Visual-AI / 3DRS
[NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
☆137 Updated last month
joyhsu0504 / NS3D
☆43 Updated 2 years ago
cshizhe / vil3dref
Official implementation of Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (NeurIPS'22).
☆66 Updated 3 years ago
sg-3d / sg3d
☆54 Updated last year
InternRobotics / Grounded_3D-LLM
Code & Data for Grounded 3D-LLM with Referent Tokens
☆131 Updated last year
zhousheng97 / EgoTextVQA
[CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
☆44 Updated 6 months ago
BolinLai / LEGO
[ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation…
☆39 Updated 10 months ago
sapeirone / EgoPack
Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2…
☆24 Updated last year
YoujunZhao / OpenScan
OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding
☆19 Updated last month
VinceOuti / Open3DVQA
☆30 Updated last month
KuanchihHuang / Reason3D
[3DV 2025] Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model
☆114 Updated 7 months ago
Chat-3D / Chat-3D
Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes"
☆56 Updated last year
zlccccc / 3DVL_Codebase
[CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
☆56 Updated 2 years ago
InternRobotics / OV_PARTS
[NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation
☆92 Updated last year
nickgkan / butd_detr
Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds"
☆94 Updated 2 years ago
Haochen-Wang409 / ross3d
[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
☆63 Updated 5 months ago
YunzeMan / Lexicon3D
[NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
☆99 Updated 11 months ago
YunzeMan / Situation3D
[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning
☆42 Updated last year
zlccccc / 3DVG-Transformer
[ICCV2021] 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds
☆43 Updated 3 years ago