GigaAI-research / VLA-R1
☆19 · Updated this week
Alternatives and similar repositories for VLA-R1
Users who are interested in VLA-R1 are comparing it to the repositories listed below.
- Project Page for GaussianFormer ☆24 · Updated last year
- [NeurIPS 2025] Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency ☆62 · Updated last month
- [ICCV 2025] IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation ☆45 · Updated 2 months ago
- ☆24 · Updated 4 months ago
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆115 · Updated 4 months ago
- [CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision ☆29 · Updated 3 weeks ago
- LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussians ☆16 · Updated 9 months ago
- [ICCV 2025] Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model ☆83 · Updated 10 months ago
- Unifying 2D and 3D Vision-Language Understanding ☆109 · Updated 2 months ago
- Nav-R1: Reasoning and Navigation in Embodied Scenes ☆58 · Updated 2 weeks ago
- [CVPR 2025 Highlight] Towards Autonomous Micromobility through Scalable Urban Simulation ☆122 · Updated this week
- Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving ☆29 · Updated this week
- ☆54 · Updated 4 months ago
- [ECCV24] Navigation Instruction Generation with BEV Perception and Large Language Models ☆30 · Updated last year
- [CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding ☆56 · Updated last year
- Open-Vocabulary SAM3D: Understand Any 3D Scene ☆33 · Updated 4 months ago
- [Arxiv'25] MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization ☆48 · Updated last month
- [ICCV 2025] Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding ☆61 · Updated 9 months ago
- ☆14 · Updated 4 months ago
- [ICCV 2025] 3DGraphLLM is a model that uses a 3D scene graph and an LLM to perform 3D vision-language tasks. ☆82 · Updated 2 months ago
- [RAL 2024] OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding ☆28 · Updated 8 months ago
- ☆99 · Updated 10 months ago
- Code of 3DMIT: 3D MULTI-MODAL INSTRUCTION TUNING FOR SCENE UNDERSTANDING ☆31 · Updated last year
- ☆47 · Updated last year
- 4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration ☆38 · Updated 3 months ago
- CVPR 2025: VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction ☆58 · Updated 2 months ago
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation ☆167 · Updated 4 months ago
- This repository is the official implementation of our paper (From reactive to cognitive: brain-inspired spatial intelligence for embodied… ☆61 · Updated last month
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆42 · Updated 10 months ago
- [CVPR 2024] Memory-based Adapters for Online 3D Scene Perception ☆120 · Updated 6 months ago