SanMumumu / FlowRAM
[2025CVPR] FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation
☆50Updated 2 months ago
Alternatives and similar repositories for FlowRAM
Users that are interested in FlowRAM are comparing it to the libraries listed below
- Official code release for paper "Robo-Imagine: A Robotic Video Generation Model, For Autoregressive Long-Term Task Video Generation With …☆28Updated 6 months ago
- This is a project about visual spatial reasoning.☆89Updated 3 weeks ago
- TorchHook: A PyTorch hooks manager, providing convenient interfaces to capture feature maps and debug models.☆13Updated 4 months ago
- [PG 2025] BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion☆57Updated last week
- [CVPR 2025, All Strong Accept] TSP3D: Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding☆249Updated 7 months ago
- (Preprint) ORV: 4D Occupancy-centric Robot Video Generation.☆76Updated 2 months ago
- GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models☆483Updated 4 months ago
- EO: Open-source Unified Embodied Foundation Model Series☆287Updated 2 months ago
- vue3-elementPlus-admin, vue3-elementPlus-template☆59Updated 2 months ago
- [ICCV 2025] HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation☆227Updated 6 months ago
- ☆58Updated 7 months ago
- [NeurIPS 2025 spotlight] Official implementation for "FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving…☆581Updated 4 months ago
- [ICRA 2026] A Unified Driving World Model for Future Generation and Perception☆136Updated 6 months ago
- ☆53Updated last month
- 🌐 3D and 4D World Modeling: A Survey☆783Updated 2 weeks ago
- [ICCV 2025] Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives☆229Updated last month
- G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning☆257Updated 2 weeks ago
- [ACMMM 2025] Official implementation of the paper "DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompti…☆215Updated 8 months ago
- A benchmark that evaluates LLMs' performance in automating drawing revision tasks.☆56Updated last month
- [NeurIPS 2025 DB Track] 3EED: Ground Everything Everywhere in 3D☆200Updated last month
- 🔥 The first open-sourced diffusion vision-language-action model.☆159Updated 3 weeks ago
- User interview platform☆24Updated 6 months ago
- [CoRLW 2025 (Oral), IASEAI 2026] Implementation for "Challenger: Affordable Adversarial Driving Video Generation"☆137Updated last month
- [ICCV23] Bird’s-Eye-View Scene Graph for Vision-Language Navigation☆122Updated last year
- OmniNWM: Omniscient Navigation World Models for Autonomous Driving☆269Updated 3 months ago
- 🌐 Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future☆278Updated 2 weeks ago
- Official code of Motus: A Unified Latent Action World Model☆597Updated 3 weeks ago
- A comprehensive and critical synthesis of the emerging role of GenAI across the full autonomous driving stack☆225Updated 4 months ago
- [CVPR24] Volumetric Environment Representation for Vision-Language Navigation☆137Updated last year
- [CVPR 2025] UniScene: Unified Occupancy-centric Driving Scene Generation☆550Updated 2 weeks ago