hustvl / MIM4D
[IJCV 2025] MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning
☆62 Updated last year
Alternatives and similar repositories for MIM4D:
Users who are interested in MIM4D are comparing it to the repositories listed below.
- [ECCV 2024] Occupancy as Set of Points ☆89 Updated 10 months ago
- [ECCV 2024] WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation ☆103 Updated 3 months ago
- ☆99 Updated 5 months ago
- This is the official implementation of UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving ☆40 Updated this week
- [CVPR 2024] Symphonies (Scene-from-Insts): Symphonize 3D Semantic Scene Completion with Contextual Instance Queries ☆178 Updated 10 months ago
- [CVPR 2025] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding ☆136 Updated last month
- [ECCV 2024] Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression ☆41 Updated 7 months ago
- [ICLR2025] OccProphet: Pushing Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with Observer-Forecaster-Refiner Framework ☆35 Updated 3 weeks ago
- ☆90 Updated 3 months ago
- Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model ☆79 Updated 4 months ago
- [CVPR'25] LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes ☆43 Updated 2 months ago
- [CVPR 2024] Official PyTorch Code of SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects ☆98 Updated last week
- [WACV 2025 Oral] Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding ☆54 Updated 2 months ago
- ☆108 Updated 9 months ago
- Doe-1: Closed-Loop Autonomous Driving with Large World Model ☆89 Updated 3 months ago
- ☆29 Updated 8 months ago
- [ECCV 2024] Official implementation for "RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception" ☆30 Updated last month
- Official PyTorch implementation of End-to-end 3D Tracking with Decoupled Queries [ICCV 2023] ☆65 Updated last year
- ☆78 Updated last year
- [NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model ☆67 Updated 5 months ago
- Official Code Release of Delphi ☆55 Updated 11 months ago
- [ECCV 2024] Towards Stable 3D Object Detection ☆45 Updated 9 months ago
- Source code for NeurIPS paper "POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images" ☆106 Updated 4 months ago
- Street-View Image Generation from a Bird’s-Eye View Layout: Official Codebase ☆74 Updated last year
- OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving ☆173 Updated 11 months ago
- Code release for our NeurIPS 2023 paper "Uni3DETR: Unified 3D Detection Transformer", our ECCV 2024 paper "OV-Uni3DETR: Towards Unified O… ☆98 Updated 9 months ago
- Codes for ICLR 2024: "MixSup: Mixed-grained Supervision for Label-efficient LiDAR-based 3D Object Detection" ☆70 Updated 9 months ago
- [ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes ☆118 Updated 2 months ago
- [CVPR 2023] Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark ☆81 Updated 2 years ago
- (ICCV2023) MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection ☆82 Updated last year