ShijieZhou-UCLA/VLM4D

ShijieZhou-UCLA / VLM4D

[ICCV 2025] VLM4D: Towards Spatiotemporal Awareness in Vision Language Models

☆55

Alternatives and similar repositories for VLM4D

Users who are interested in VLM4D are comparing it to the libraries listed below.

ShijieZhou-UCLA / Feature4X
[CVPR 2025] Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
☆41 · Oct 18, 2025 · Updated 9 months ago
TencentARC / DSR_Suite
☆74 · Apr 21, 2026 · Updated 3 months ago
InternRobotics / G2VLM
[CVPR 2026] G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
☆346 · Apr 18, 2026 · Updated 3 months ago
Dynamics-X / DynamicVerse
[NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling"
☆101 · Dec 21, 2025 · Updated 7 months ago
ShijieZhou-UCLA / DreamScene360
[ECCV 2024] DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting
☆130 · May 13, 2026 · Updated 2 months ago
VITA-Group / VLM-3R
[CVPR 2026] VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
☆431 · Jul 15, 2026 · Updated 2 weeks ago
PeiwenSun2000 / SpaceVista
The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.
☆43 · May 26, 2026 · Updated 2 months ago
InternRobotics / OST-Bench
[NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
☆80 · Sep 29, 2025 · Updated 10 months ago
Zhangyr2022 / MoGe4D
[ECCV 2026] Geometry-Aware Single-Image 4D Synthesis via Dense Trajectory Generation
☆69 · Jul 11, 2026 · Updated 2 weeks ago
Yoonkyo / TraceForge
Official code for "TraceGen: World Modeling in 3D Trace-Space Enables Learning from Cross-Embodiment Videos" (CVPR 2026)
☆18 · Jan 31, 2026 · Updated 5 months ago
RuijieZhu94 / HABins
[ECCVW 2022 & TCSVT 2023] HA-Bins: Hierarchical Adaptive Bins for Robust Monocular Depth Estimation across Multiple Datasets. 2nd place i…
☆11 · Jun 6, 2024 · Updated 2 years ago
GVCLab / MLLM-4D
[ICML 2026] MLLM-4D: Towards Visual-based Spatial-Temporal Intelligence
☆36 · May 1, 2026 · Updated 2 months ago
AntResearchNLP / ViLaSR
[NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
☆98 · Jul 27, 2025 · Updated last year
HavenFeng / St4RTrack
Official Implementation of paper "St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World"
☆143 · Sep 18, 2025 · Updated 10 months ago
yz-cnsdqz / DOMA-release
official implementation of [Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories, CVPR'24]
☆16 · Oct 23, 2025 · Updated 9 months ago
Hoyyyaard / 3DFlowAction
☆62 · Jul 6, 2025 · Updated last year
yliu-cs / SSR
[NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
☆40 · Oct 14, 2025 · Updated 9 months ago
haoz19 / MagicPose4D
Code for MagicPose4D: Crafting Articulated Models with Appearance and Motion Control
☆110 · Oct 8, 2024 · Updated last year
mengcaopku / SpatialDreamer
SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
☆15 · Feb 1, 2026 · Updated 5 months ago
RuijieZhu94 / ER-Depth
[TOMM 2025] ER-Depth: Enhancing the Robustness of Self-Supervised Monocular Depth Estimation in Challenging Scenes
☆13 · Jan 12, 2026 · Updated 6 months ago
mll-lab-nu / Awesome-Spatial-Intelligence-in-VLM
A paper list for spatial reasoning
☆767 · Jan 19, 2026 · Updated 6 months ago
ShijieZhou-UCLA / feature-3dgs
[CVPR 2024 Highlight] Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields
☆676 · Oct 17, 2024 · Updated last year
skwak-kaist / MoDec-GS
[CVPR2025] MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting
☆41 · Jul 31, 2025 · Updated 11 months ago
LogosRoboticsGroup / SPAR
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…
☆90 · Jan 5, 2026 · Updated 6 months ago
Biscue5 / EgoScaler
[CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision
☆48 · Dec 2, 2025 · Updated 7 months ago
Shadow-Dream / Reaction-Graph
[ICML 2025] Official implementation of Reaction Graph: Towards Reaction-Level Modeling for Chemical Reactions with 3D Structures
☆16 · Sep 4, 2025 · Updated 10 months ago
vision-x-nyu / thinking-in-space
Official repo and evaluation implementation of VSI-Bench
☆734 · Aug 5, 2025 · Updated 11 months ago
RuijieZhu94 / TI-Face
[ICCVW 2023] TIFace: Improving Facial Reconstruction through Tensorial Radiance Fields and Implicit Surfaces. 1st place at VSCHH @ ICCV 2…
☆18 · Dec 20, 2023 · Updated 2 years ago
nv-tlabs / L4GM-official
[NeurIPS 2024] L4GM: Large 4D Gaussian Reconstruction Model
☆257 · Jan 24, 2025 · Updated last year
paintscene4d / paintscene4d.github.io
☆25 · Mar 30, 2025 · Updated last year
IGL-HKUST / TrackingWorld
[NeurIPS 25] TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
☆193 · Dec 25, 2025 · Updated 7 months ago
cambrian-mllm / cambrian-s
Cambrian-S: Towards Spatial Supersensing in Video
☆563 · Apr 3, 2026 · Updated 3 months ago
mll-lab-nu / Theory-of-Space
THEORY OF SPACE: a benchmark for evaluating whether foundation models can actively explore under partial observability efficiently to bui…
☆85 · Feb 27, 2026 · Updated 5 months ago
yukangcao / Awesome-4D-Spatial-Intelligence
A curated list of awesome papers for reconstructing 4D spatial intelligence from video. (arXiv 2507.21045)
☆514 · Jun 5, 2026 · Updated last month
chenj02 / DASH
[ICCV 2025] DASH: 4D Hash Encoding with Self-Supervised Decomposition for Real-Time Dynamic Scene Rendering
☆27 · Apr 13, 2026 · Updated 3 months ago
adobe-research / Can3Tok
Official code for the paper: Can3Tok (ICCV2025)
☆50 · Aug 29, 2025 · Updated 11 months ago
Astrorix / MARG
[IEEE T-RO 2025] MAstering Risky Gap Terrains for Legged Robots with Elevation Mapping
☆19 · Dec 17, 2025 · Updated 7 months ago
SooLab / MVTokenFlow
[ICLR 2025] MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
☆27 · Apr 9, 2025 · Updated last year
TRAILab / GeneralObjectMapping
Official Code Repository for the CoRL 2024 Paper: "Toward General Object-Level Mapping from Sparse Views with 3D Diffusion Priors"
☆32 · Jan 7, 2025 · Updated last year