mayuelala / mayuelala.github.io
My HomePage
★12 · Updated last month
Alternatives and similar repositories for mayuelala.github.io
Users interested in mayuelala.github.io are comparing it to the libraries listed below.
- [CVPR2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos (★140, updated 3 weeks ago)
- A curated list of CVPR 2025 Oral papers, 96 in total (★51, updated 2 months ago)
- This repository is the official implementation of our paper (From reactive to cognitive: brain-inspired spatial intelligence for embodied…) (★61, updated last month)
- Nav-R1: Reasoning and Navigation in Embodied Scenes (★58, updated 2 weeks ago)
- Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence (★363, updated 3 months ago)
- [ICCV 2025] GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scene (★135, updated 3 weeks ago)
- [CVPR2025] ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding (★45, updated last month)
- [ARXIV'25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control (★81, updated 3 months ago)
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts (★181, updated this week)
- Official code for the paper: Depth Anything At Any Condition (★296, updated last month)
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding (★114, updated 4 months ago)
- [CVPR 2024 Highlight] GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding (★27, updated last year)
- Unifying 2D and 3D Vision-Language Understanding (★109, updated 2 months ago)
- [CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding (★178, updated 5 months ago)
- [ICCV 2025 Highlights] Large-scale photo-realistic virtual worlds for embodied AI (★194, updated last week)
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction (★278, updated last month)
- Official code for the CVPR 2025 paper "Navigation World Models" (★411, updated 2 months ago)
- Evo-0: Vision-Language-Action Model with Implicit Spatial Understanding (★37, updated 3 months ago)
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" (★173, updated 2 weeks ago)
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy (★129, updated this week)
- [arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps (★68, updated 2 weeks ago)
- A list of works on video generation towards world models (★167, updated 2 months ago)
- OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling (★374, updated last week)
- A curated list of awesome exploration policy papers (★11, updated 3 months ago)
- [ICCV2025] Official repository of the paper "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary…" (★102, updated last week)
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation (★167, updated 4 months ago)
- [ICCV 2025] MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation (★38, updated 2 months ago)