hustvl / MIM4D
MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning
☆62 · Updated 8 months ago
Related projects
Alternatives and complementary repositories for MIM4D
- WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation ☆82 · Updated 2 months ago
- [ECCV 2024] Occupancy as Set of Points ☆81 · Updated 4 months ago
- [CVPR 2024] Symphonies (Scene-from-Insts): Symphonize 3D Semantic Scene Completion with Contextual Instance Queries ☆168 · Updated 4 months ago
- [ECCV 2024] Official implementation for "RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception" ☆22 · Updated this week
- [NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model ☆31 · Updated this week
- [CVPR 2024] Official PyTorch Code of SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects ☆85 · Updated 3 weeks ago
- Official implementation for 'SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction' (CVPR 202… ☆48 · Updated 3 months ago
- [IROS 2024] InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction ☆24 · Updated 4 months ago
- [ECCV 2024] Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression ☆37 · Updated 2 months ago
- [NeurIPS 2024] OPUS: Occupancy Prediction Using a Sparse Set ☆67 · Updated last month
- OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving ☆148 · Updated 5 months ago
- [NeurIPS 2024] TALoS: Enhancing Semantic Scene Completion via Test-time Adaptation on the Line of Sight ☆19 · Updated 3 weeks ago
- [ECCV 2024] 4D Contrastive Superflows are Dense 3D Representation Learners ☆41 · Updated 2 months ago
- [ECCV 2024] ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers ☆39 · Updated 3 weeks ago
- Official Code Release of Delphi ☆52 · Updated 5 months ago
- [ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes ☆101 · Updated 3 months ago
- A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios ☆47 · Updated 5 months ago
- BEVGen ☆66 · Updated 9 months ago
- [CVPR 2023] Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark ☆73 · Updated last year
- [WACV'25] Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding ☆44 · Updated 8 months ago
- MambaOcc: Visual State Space Model for BEV-based Occupancy Prediction with Local Adaptive Reordering ☆19 · Updated 3 months ago
- Codes for ICLR 2024: "MixSup: Mixed-grained Supervision for Label-efficient LiDAR-based 3D Object Detection" ☆65 · Updated 4 months ago
- ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention (ECCV 2024) ☆74 · Updated 4 months ago
- Code release for our NeurIPS 2023 paper "Uni3DETR: Unified 3D Detection Transformer", our ECCV 2024 paper "OV-Uni3DETR: Towards Unified O… ☆83 · Updated 3 months ago
- Official PyTorch implementation of End-to-end 3D Tracking with Decoupled Queries [ICCV 2023] ☆58 · Updated 10 months ago