GigaAI-research / SwiftVLA
☆54 Updated 2 months ago
Alternatives and similar repositories for SwiftVLA
Users who are interested in SwiftVLA are comparing it to the repositories listed below:
- [ICCV 2025] Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding ☆70 Updated last year
- [CVPR 2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos ☆192 Updated 4 months ago
- [ICCV 2025] IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation ☆63 Updated 6 months ago
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆129 Updated 8 months ago
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation ☆172 Updated 7 months ago
- [NeurIPS 2025] Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency ☆76 Updated 4 months ago
- [AAAI 2026] WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving ☆29 Updated last month
- Official implementation of "From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction" ☆59 Updated 2 months ago
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge ☆286 Updated last month
- Evo-0: Vision-Language-Action Model with Implicit Spatial Understanding. ☆53 Updated 2 months ago
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation ☆176 Updated 7 months ago
- This repository is the official implementation of our paper (From reactive to cognitive: brain-inspired spatial intelligence for embodied… ☆76 Updated 3 months ago
- Official implementation of "Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation" (NeurIPS'25 Oral) ☆75 Updated last month
- Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer ☆28 Updated 3 months ago
- 4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration. Accepted to NeurIPS 2025. ☆47 Updated last month
- Nav-R1: Reasoning and Navigation in Embodied Scenes ☆110 Updated 3 months ago
- ☆234 Updated 6 months ago
- Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models ☆146 Updated last month
- ☆87 Updated 8 months ago
- Official implementation of ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver. ☆204 Updated 2 weeks ago
- [CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding ☆208 Updated 9 months ago
- Official implementation of T-PAMI25 paper "M²Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes" ☆108 Updated 7 months ago
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" ☆228 Updated last month
- [CVPR 2024] Memory-based Adapters for Online 3D Scene Perception ☆125 Updated 10 months ago
- [ICLR 2026] Codebase for paper "Geometry-aware 4D Video Generation for Robot Manipulation" ☆72 Updated last month
- ☆82 Updated 6 months ago
- Official implementation of Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model ☆180 Updated last month
- EO: Open-source Unified Embodied Foundation Model Series ☆48 Updated 3 weeks ago
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer” ☆78 Updated 2 months ago
- Official implementation for "SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation" ☆54 Updated 2 months ago