mayuelala / mayuelala.github.io
My HomePage
★12 · Updated last month
Alternatives and similar repositories for mayuelala.github.io
Users interested in mayuelala.github.io are comparing it to the libraries listed below.
- [CVPR2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos (★140, updated 3 weeks ago)
- A curated list of CVPR 2025 Oral papers, 96 in total (★51, updated 2 months ago)
- This repository is the official implementation of our paper (From reactive to cognitive: brain-inspired spatial intelligence for embodied…) (★61, updated last month)
- Nav-R1: Reasoning and Navigation in Embodied Scenes (★58, updated 2 weeks ago)
- Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence (★363, updated 3 months ago)
- [ICCV 2025] GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scene (★135, updated 3 weeks ago)
- [CVPR2025] ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding (★45, updated last month)
- [ARXIV'25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control (★81, updated 3 months ago)
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts (★181, updated this week)
- Official code for the paper: Depth Anything At Any Condition (★296, updated last month)
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding (★114, updated 4 months ago)
- [CVPR 2024 Highlight] GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding (★27, updated last year)
- Unifying 2D and 3D Vision-Language Understanding (★109, updated 2 months ago)
- [CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding (★178, updated 5 months ago)
- [ICCV 2025 Highlights] Large-scale photo-realistic virtual worlds for embodied AI (★194, updated last week)
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction (★278, updated last month)
- Official code for the CVPR 2025 paper "Navigation World Models" (★411, updated 2 months ago)
- Evo-0: Vision-Language-Action Model with Implicit Spatial Understanding (★37, updated 3 months ago)
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" (★173, updated 2 weeks ago)
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy (★129, updated this week)
- [arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps (★68, updated 2 weeks ago)
- A list of works on video generation towards world models (★167, updated 2 months ago)
- OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling (★374, updated last week)
- A curated list of awesome exploration policy papers (★11, updated 3 months ago)
- [ICCV2025] Official repository of the paper "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary…" (★102, updated last week)
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation (★167, updated 4 months ago)
- [ICCV 2025] MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation (★38, updated 2 months ago)