KwanWaiPang / Awesome-VLA
Paper Survey for Visual Language Action
☆27Updated last week
Alternatives and similar repositories for Awesome-VLA
Users that are interested in Awesome-VLA are comparing it to the libraries listed below
- [CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding☆205Updated 9 months ago
- Code for Streaming 4D Visual Geometry Transformer☆802Updated 3 months ago
- ☆395Updated this week
- A curated list of awesome papers for reconstructing 4D spatial intelligence from video. (arXiv 2507.21045)☆424Updated last week
- [RA-L 2025] Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation☆137Updated 9 months ago
- [ICCV 2025 Oral] SceneSplat - Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining☆301Updated last month
- [NeurIPS 2024] SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation☆314Updated 4 months ago
- ☆227Updated 5 months ago
- [IROS 25] Dynamic 3D Gaussian Scene Graphs for Environment Adaptation☆69Updated last month
- [ICCV 2025 & ICCV 2025 RIWM Outstanding Paper] Aether: Geometric-Aware Unified World Modeling☆566Updated 3 months ago
- PyTorch implementation of paper: GaussNav: Gaussian Splatting for Visual Navigation☆184Updated last year
- Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling"☆382Updated 2 months ago
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation☆172Updated 7 months ago
- The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'☆196Updated 2 months ago
- The modified differential Gaussian rasterization in the CVPR 2024 highlight paper: GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting.☆240Updated last year
- Code for FastVGGT: Training-Free Acceleration of Visual Geometry Transformer☆659Updated 3 weeks ago
- ☆19Updated 9 months ago
- Official implementation of the paper "Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting".☆245Updated last year
- [CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI☆649Updated 7 months ago
- [SIGGRAPH Asia 2025 (ACM TOG)] AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views☆699Updated last month
- [CVPR 2025] UniGoal: Towards Universal Zero-shot Goal-oriented Navigation☆295Updated 4 months ago
- Official implementation of EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting☆54Updated 7 months ago
- Official implementation of VGGT-Long☆768Updated 2 weeks ago
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction☆329Updated 4 months ago
- 🌟 A curated list of papers, datasets, and projects for 3D Reconstruction and Generation.☆52Updated 2 months ago
- [CVPR 2024] Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships☆143Updated last year
- [CVPR 2024] Memory-based Adapters for Online 3D Scene Perception☆125Updated 10 months ago
- [ECCV 2024] ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation☆262Updated 10 months ago
- [ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World☆369Updated 3 months ago
- [CVPR 2024] GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding☆18Updated last year