Official code of "Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding"
☆174Mar 21, 2026Updated this week
Alternatives and similar repositories for VEGA-3D
Users that are interested in VEGA-3D are comparing it to the libraries listed below.
- StableWorld: Towards Stable and Consistent Long Interactive Video Generation☆86Mar 18, 2026Updated last week
- 5th CLVISION workshop at CVPR: repo for the challenge☆19May 13, 2024Updated last year
- CoV: Chain-of-View Prompting for Spatial Reasoning☆52Jan 23, 2026Updated 2 months ago
- [MM2024 Oral] 3D-GRES: Generalized 3D Referring Expression Segmentation☆42Dec 15, 2024Updated last year
- ☆23Jan 15, 2026Updated 2 months ago
- [ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos☆25Aug 8, 2025Updated 7 months ago
- [ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection☆13Apr 12, 2024Updated last year
- Official Implementation of Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training☆143Mar 13, 2026Updated last week
- [ECCV 2024] Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression☆52Sep 21, 2024Updated last year
- [CVPR 2026] Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction☆54Mar 18, 2026Updated last week
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆39Mar 11, 2026Updated 2 weeks ago
- [ICRA 2026] UniFuture: A 4D Driving World Model for Future Generation and Perception☆147Feb 26, 2026Updated 3 weeks ago
- [CVPR 2026 Findings] SwiftVGGT: A Scalable Visual Geometry Grounded Transformer for Large-Scale Scenes☆59Nov 25, 2025Updated 4 months ago
- THEORY OF SPACE: a benchmark for evaluating whether foundation models can actively explore under partial observability efficiently to bui…☆63Feb 27, 2026Updated 3 weeks ago
- Official code release for the PVSM paper: "From Rays to Projections: Better Inputs for Feed-Forward View Synthesis"☆41Jan 9, 2026Updated 2 months ago
- Official code repository of Shuffle-R1☆25Feb 23, 2026Updated last month
- Offside Judgment for Soccer Matches Using Drones☆16May 27, 2021Updated 4 years ago
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…☆81Jan 5, 2026Updated 2 months ago
- Java implementation of a BPNN (backpropagation neural network) with an arbitrary number of layers☆12Dec 13, 2016Updated 9 years ago
- ☆17Jan 26, 2025Updated last year
- ☆21May 15, 2025Updated 10 months ago
- [T-PAMI 2025] Scale Propagation Network for Generalizable Depth Completion☆27Apr 1, 2025Updated 11 months ago
- YAICON 3rd project page - 4D Gaussian for Head Reconstruction☆11Dec 22, 2023Updated 2 years ago
- Accelerating Long Context LLM Inference with Accuracy-Preserving Context Optimization in SGLang, vLLM, llama.cpp, RAG, and Agentic AI.☆65Updated this week
- [ICCV 2025] PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes☆60Oct 3, 2025Updated 5 months ago
- Official repository for "Vid2World: Crafting Video Diffusion Models to Interactive World Models" (ICLR 2026), https://arxiv.org/abs/2505.…☆47Jan 27, 2026Updated last month
- [AAAI 2026] Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices☆95Nov 30, 2025Updated 3 months ago
- Physics-based Zero-Shot Video Generation☆31Oct 4, 2024Updated last year
- [ICLR 2026] NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks☆141Oct 20, 2025Updated 5 months ago
- Spa3R: Predictive Spatial Field Modeling for 3D Visual Reasoning☆45Mar 6, 2026Updated 2 weeks ago
- [CVPR 2025] Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation☆45Jan 9, 2026Updated 2 months ago
- Click to Grasp takes calibrated RGB-D images of a tabletop and user-defined part instances in diverse source images as input, and produce…☆21Apr 4, 2024Updated last year
- [ICCV2025] RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation☆36Jul 21, 2025Updated 8 months ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?☆90Jul 13, 2025Updated 8 months ago
- 3D - NeRF++ Volume Rendering visualization☆13Jan 24, 2022Updated 4 years ago
- [CVPR'25] How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions☆32Oct 5, 2025Updated 5 months ago
- GRPO Algorithm for Llava Architecture (Based on Verl)☆49May 9, 2025Updated 10 months ago
- [NeurIPS 2024] A Unified Framework for 3D Scene Understanding☆173Jul 7, 2025Updated 8 months ago
- [ICCV 2025 Highlight] VolumetricSMPL☆88Jul 27, 2025Updated 7 months ago