WU-CVGL / GS-Reasoner
Reasoning in Space via Grounding in the World
☆43 Updated 2 months ago
Alternatives and similar repositories for GS-Reasoner
Users who are interested in GS-Reasoner are comparing it to the repositories listed below.
- The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors' ☆196 Updated 2 months ago
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer” ☆76 Updated last month
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction ☆329 Updated 4 months ago
- UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding ☆58 Updated 5 months ago
- [NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding ☆100 Updated 11 months ago
- [NeurIPS 2025 Spotlight] Official implementation of the SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alig… ☆156 Updated 4 months ago
- Unifying 2D and 3D Vision-Language Understanding ☆119 Updated 6 months ago
- Official code for paper: N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models ☆81 Updated 2 weeks ago
- SceneFun3D ToolKit ☆166 Updated 9 months ago
- Official implementation of paper "Pyramid Diffusion for Fine 3D Large Scene Generation" (ECCV 2024 Oral) ☆132 Updated 9 months ago
- Official implementation of Video-DPM ☆134 Updated last week
- [ICLR 2025] Official code of "Segment any 3D Object with Language" ☆62 Updated 3 months ago
- Code for "BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation", ICCV 2025. ☆101 Updated 3 months ago
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio… ☆75 Updated 3 weeks ago
- OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling ☆417 Updated 3 weeks ago
- [NeurIPS 24] The implementation and dataset of LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and… ☆60 Updated 10 months ago
- Official code of DMA: Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding, ECCV 2024 ☆31 Updated last year
- Official implementation of DepthLM ☆290 Updated this week
- [CVPR 2024 Highlight] GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding ☆26 Updated last year
- [CVPR 2025] 3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer ☆86 Updated 8 months ago
- [ACM MM 2025] EmbodiedOcc++: Boosting Embodied 3D Occupancy Prediction with Plane Regularization and Uncertainty Sampler ☆25 Updated 5 months ago
- Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image ☆63 Updated last month
- Official implementation of EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting ☆54 Updated 7 months ago
- [CVPR 2025 Highlight🔥] Official code repository for "Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuni… ☆126 Updated 2 months ago
- [NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding ☆144 Updated last month
- [CVPR 2025] The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding". ☆196 Updated 7 months ago
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation ☆172 Updated 7 months ago
- "VicaSplat: A Single Run is All You Need for 3D Gaussian Splatting and Camera Estimation from Unposed Video Frames" ☆91 Updated 6 months ago
- [ICML2025 Oral] ReferSplat: Referring Segmentation in 3D Gaussian Splatting ☆136 Updated 4 months ago
- Official code for paper: "RayRoPE: Projective Ray Positional Encoding for Multi-view Attention" ☆34 Updated last week