worldbench / WorldLens
WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
☆69 · Updated this week
Alternatives and similar repositories for WorldLens
Users who are interested in WorldLens are comparing it to the repositories listed below.
- A Unified Driving World Model for Future Generation and Perception ☆127 · Updated 4 months ago
- [NeurIPS 2025 DB Track] 3EED: Ground Everything Everywhere in 3D ☆193 · Updated last week
- [NeurIPS 2025 Spotlight] Towards Understanding Camera Motions in Any Video ☆248 · Updated 3 weeks ago
- Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views ☆107 · Updated last week
- 4DNeX: Feed-Forward 4D Generative Modeling Made Easy ☆801 · Updated this week
- Wan2.1 with Controlnet ☆178 · Updated 8 months ago
- Are Video Models Ready as Zero-shot Reasoners? ☆84 · Updated 3 weeks ago
- G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning ☆214 · Updated 3 weeks ago
- [ACMMM 2025] Official implementation of the paper "DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompti… ☆210 · Updated 7 months ago
- 🔥 OneThinker: All-in-one Reasoning Model for Image and Video ☆319 · Updated last week
- [CVPR 2024] Official implementation of "Universal Segmentation at Arbitrary Granularity with Language Instruction" ☆284 · Updated last year
- OmniNWM: Omniscient Navigation World Models for Autonomous Driving ☆260 · Updated last month
- [AAAI 2026 🔥] Official implementation of "NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representation" ☆174 · Updated 4 months ago
- GigaBrain-0: A World Model-Powered Vision-Language-Action Model ☆603 · Updated 3 weeks ago
- Official Implementation of Puzzles: Unbounded Video-Depth Augmentation for Scalable, End-to-End 3D Reconstruction. ☆208 · Updated 3 months ago
- [Tech Report] Few-Step Distillation for Text-to-Image Generation: A Practical Guide ☆132 · Updated this week
- [Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics]: VisuoThink: Empowering LVLM Reasoning with Mul… ☆99 · Updated 4 months ago
- GigaWorld-0: World Models as Data Engine to Empower Embodied AI ☆717 · Updated 2 weeks ago
- 🔥 The first open-sourced diffusion vision-language-action model. ☆134 · Updated last week
- CoS: Chain-of-Shot Prompting for Long Video Understanding ☆52 · Updated 10 months ago
- ☆140 · Updated 8 months ago
- [AAAI 2026 Oral] LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences ☆178 · Updated last week
- [ICLR 2025] Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling ☆82 · Updated 10 months ago
- A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale. ☆678 · Updated last week
- Identity-GRPO: Optimizing Multi-Human Identity-preserving Video Generation via Reinforcement Learning ☆167 · Updated last month
- [CoRL2024] Let Occ Flow: Self-Supervised 3D Occupancy Flow Prediction ☆127 · Updated 2 months ago
- [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies ☆224 · Updated 8 months ago
- [ICRA 2025] PUGS: Zero-shot Physical Understanding with Gaussian Splatting. ☆102 · Updated 8 months ago
- Official implementation of "Enhancing Close-up Novel View Synthesis via Pseudo-labeling" [AAAI 2025] ☆15 · Updated 8 months ago
- [CVPR2024] Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion ☆135 · Updated last year