tsinghua-fib-lab / UrbanLLaVA
[ICCV 2025] UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding.
☆62Updated 3 months ago
Alternatives and similar repositories for UrbanLLaVA
Users who are interested in UrbanLLaVA are comparing it to the repositories listed below.
- [KDD 2025 D&B] CityBench: Evaluating the Capabilities of Large Language Models for Urban Tasks.☆45Updated 5 months ago
- [ACL'25 Oral] Code for the paper "UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban…☆24Updated 5 months ago
- Learning Text-Enhanced Urban Region Profiling with Contrastive Language-Image Pre-Training☆41Updated last year
- [KDD 2025 Research] CityGPT: Empowering Urban Spatial Cognition of Large Language Models.☆46Updated 5 months ago
- [ICCV 2025] The official implementation of the paper “Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm”☆83Updated 2 months ago
- This is the official repo of OpenSatMap in NeurIPS 2024 D&B Track☆28Updated 6 months ago
- [ECCV 2024 Oral] The official implementation of paper: COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation☆10Updated last year
- [arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps☆71Updated 2 months ago
- [ICML 2024] GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model☆66Updated 2 months ago
- The official implementation of the paper "UrbanWorld: An Urban World Model for 3D City Generation"☆49Updated last year
- The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (arXiv 2025)☆39Updated 7 months ago
- Official implementation of the ICCV 2025 paper HoliTracer.☆36Updated last month
- [NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing☆87Updated 5 months ago
- MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence☆51Updated this week
- Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"☆102Updated 2 weeks ago
- OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models☆78Updated 3 months ago
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO☆76Updated last month
- [CVPR 2023] The models, datasets (satellite & street view) and correlative config files of OmniCity-v1.0 project.☆31Updated 9 months ago
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934☆177Updated 2 months ago
- [NeurIPS 2024] Terra: A Multimodal Spatio-Temporal Dataset Spanning the Earth☆78Updated 2 months ago
- STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?☆34Updated 6 months ago
- The official repository of [CVPR2025] DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering☆24Updated 8 months ago
- 🏆 Official implementation of LangCoop: Collaborative Driving with Natural Language☆70Updated 3 months ago
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision☆192Updated 2 weeks ago