fudan-generative-vision / WAM-Diff
WAM-Diff: A Masked Diffusion VLA Framework with MoE and Online Reinforcement Learning for Autonomous Driving
☆98Updated 3 weeks ago
Alternatives and similar repositories for WAM-Diff
Users who are interested in WAM-Diff are comparing it to the libraries listed below
- ☆92Updated 6 months ago
- DIVER: Reinforced Diffusion Breaks Imitation Bottlenecks in End-to-End Autonomous Driving☆146Updated last month
- 🌐 Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems☆128Updated last week
- 🌐 WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World☆170Updated last month
- This is the source code for the ECCV paper "MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning"☆199Updated 3 years ago
- [CoRL2024] Let Occ Flow: Self-Supervised 3D Occupancy Flow Prediction☆128Updated 3 months ago
- WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving☆107Updated 3 weeks ago
- Mem4Nav: Boosting Vision-and-Language Navigation in Urban Environments with a Hierarchical Spatial-Cognition Long-Short Memory System☆99Updated 5 months ago
- ☆308Updated 3 months ago
- [NeurIPS 2025] More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models☆215Updated 2 months ago
- Official implementation of paper "Unified World Models: Memory-Augmented Planning and Foresight for Visual Navigation"☆267Updated 2 months ago
- DPO-Shift: Shifting the Distribution of Direct Preference Optimization☆59Updated 10 months ago
- Hybrid SfM with VIO pose, RGB, and depth data☆52Updated 2 years ago
- ☆19Updated 8 months ago
- [EMNLP2025] Official implementation: Agent-style vision question answer in Autonomous Driving!☆134Updated 3 months ago
- A naturalistic trajectory dataset with dense driving interactions and the toolbox for driving interaction extraction.☆140Updated last month
- ☆207Updated 7 months ago
- [NeurIPS 2025] NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding☆351Updated last month
- ☆206Updated 3 weeks ago
- 🌐 Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future☆237Updated this week
- [ICCV 2025] Perspective-Invariant 3D Object Detection☆153Updated 3 weeks ago
- This repository contains the source code for our paper: "PrefMMT: Modeling Human Preferences in Preference-based Reinforcement Learning w…☆50Updated 10 months ago
- Official implementation for "HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environm…☆378Updated last month
- Wan2.1 with Controlnet☆180Updated 9 months ago
- 🔥 The first open-sourced diffusion vision-language-action model.☆151Updated last week
- Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views☆176Updated last month
- [AAAI 2026 Oral] LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences☆181Updated last month
- ☆246Updated last year
- [NeurIPS'2025] Official repository for "LiveStar: Live Streaming Assistant for Real-World Online Video Understanding"☆103Updated last month
- [AAAI 2026 Oral] Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution☆357Updated last month