SJTU-DENG-Lab/R1-Zero-VSI

☆42

Alternatives and similar repositories for R1-Zero-VSI

Users interested in R1-Zero-VSI are comparing it to the repositories listed below.

beacon-3d / Beacon3D
[CVPR 2025] Beacon3D: Object-centric Evaluation for 3D Grounding-QA
☆28 · Nov 25, 2025 · Updated 8 months ago
OuyangKun10 / SpaceR
SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning
☆111 · Jul 9, 2025 · Updated last year
Asterisci / Language-Assisted-3D
[AAAI 2023 Oral] Language-Assisted 3D Feature Learning for Semantic Scene Understanding
☆12 · Aug 1, 2023 · Updated 2 years ago
THU-SI / Spatial-MLLM
[NeurIPS 2025 Spotlight] Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
☆480 · Feb 5, 2026 · Updated 5 months ago
ZCMax / ScanReason
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
☆85 · Oct 10, 2024 · Updated last year
SJTU-DENG-Lab / LoPA
LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding
☆39 · Apr 25, 2026 · Updated 3 months ago
haoningwu3639 / SpatialScore
[CVPR 2026 Highlight] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
☆84 · May 28, 2026 · Updated last month
zhoujiahuan1991 / ICML2025-GAPrompt
Official implementation of the paper "GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model", ICML 2025
☆17 · Dec 25, 2025 · Updated 7 months ago
LaVi-Lab / Video-3D-LLM
[CVPR 2025] Code for the paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".
☆219 · Jun 4, 2025 · Updated last year
SJTU-DENG-Lab / LatentUM
☆57 · Apr 9, 2026 · Updated 3 months ago
zoezheng126 / Spatio-Temporal-LLM
☆19 · Aug 7, 2025 · Updated 11 months ago
OPPO-Mente-Lab / FaceScore
Official repo for "FaceScore: Benchmarking and Enhancing Face Quality in Human Generation"
☆84 · Dec 26, 2024 · Updated last year
MINT-SJTU / STI-Bench
STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?
☆39 · Jan 12, 2026 · Updated 6 months ago
KAIST-Visual-AI-Group / APC-VLM
[ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
☆66 · Sep 12, 2025 · Updated 10 months ago
WeitaiKang / Intent3D
[ICLR 2025] Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention
☆29 · Feb 21, 2025 · Updated last year
YuyaoZhangQAQ / QCompiler
This repository contains the code for the paper “Neuro-Symbolic Query Compiler”, accepted to the Findings of ACL 2025.
☆17 · Oct 20, 2025 · Updated 9 months ago
zerolllin / Delta-L-Normalization
☆16 · Oct 11, 2025 · Updated 9 months ago
LogosRoboticsGroup / SPAR
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…
☆90 · Jan 5, 2026 · Updated 6 months ago
Gabesarch / grounded-rl
☆133 · Jul 22, 2025 · Updated last year
zyzkevin / dyva-worldlm
☆23 · Nov 18, 2025 · Updated 8 months ago
emirhanbayar / Fast-StrongSORT
StrongSORT with Selective Feature Extraction Mechanism
☆16 · Sep 25, 2024 · Updated last year
WU-CVGL / GS-Reasoner
Reasoning in Space via Grounding in the World (ICLR 2025)
☆56 · Nov 3, 2025 · Updated 8 months ago
Yangr116 / VST
[ECCV 2026] Visual Spatial Tuning
☆200 · Mar 25, 2026 · Updated 4 months ago
SJTU-DENG-Lab / Mantis
[CVPR 2026] Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight
☆92 · Jun 5, 2026 · Updated last month
Ivan-Tang-3D / ENEL
[ICLR 2026] The official implementation of the paper "Exploring the Potential of Encoder-free Architectures in 3D LMMs"
☆11 · Jan 26, 2026 · Updated 6 months ago
ZhangXJ199 / TinyLLaVA-Video-R1
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
☆116 · Dec 24, 2025 · Updated 7 months ago
huang-yh / Owl
☆52 · Dec 13, 2024 · Updated last year
aluo-x / 3D_SLN
Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)
☆56 · Dec 23, 2020 · Updated 5 years ago
waystogetthere / Interpretable-Transformer-Hawkes-Process
☆10 · Jul 11, 2025 · Updated last year
LunarShen / DsicoVLA
[CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval
☆22 · Jun 23, 2025 · Updated last year
ZJU-REAL / ViewSpatial-Bench
[ECCV 2026] ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models
☆82 · Mar 9, 2026 · Updated 4 months ago
ZzZZCHS / Chat-Scene
[NeurIPS 2024 & TPAMI 2026] Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
☆216 · Apr 12, 2026 · Updated 3 months ago
OpenGVLab / TPO
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
☆65 · Jul 22, 2025 · Updated last year
jzh15 / SpatialStack
[CVPR 2026] SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning
☆31 · Jul 15, 2026 · Updated last week
kylehkhsu / tripod
☆12 · Apr 19, 2024 · Updated 2 years ago
3DLLM-Mem / 3DLLM-Mem
☆27 · Jun 5, 2025 · Updated last year
vision-x-nyu / thinking-in-space
Official repo and evaluation implementation of VSI-Bench
☆734 · Aug 5, 2025 · Updated 11 months ago
AnjieCheng / SpatialRGPT
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
☆336 · Dec 14, 2024 · Updated last year
agents-x-project / PyVision-RL
[ICML 2026] Official implementation of "PyVision-RL: Forging Open Agentic Vision Models via RL."
☆70 · Feb 25, 2026 · Updated 5 months ago