ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models
☆71Mar 9, 2026Updated last week
Alternatives and similar repositories for ViewSpatial-Bench
Users who are interested in ViewSpatial-Bench are comparing it to the repositories listed below.
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604 ☆55Nov 4, 2025Updated 4 months ago
- ☆32Aug 11, 2025Updated 7 months ago
- [ICLR 2026] InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models☆48Feb 12, 2026Updated last month
- ☆25Aug 19, 2025Updated 7 months ago
- [AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615 ☆61Nov 8, 2025Updated 4 months ago
- A curated collection of resources, tools, and frameworks for developing GUI Agents.☆328Updated this week
- [AAAI 2026] GUI-G²: Gaussian Reward Modeling for GUI Grounding☆305Feb 2, 2026Updated last month
- The official implementation of dLLM-Factory☆26Jul 12, 2025Updated 8 months ago
- Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs☆65Jan 1, 2026Updated 2 months ago
- An Advanced Basic Math Reasoning and Overthinking Evaluation Framework for LLMs☆12Jul 8, 2025Updated 8 months ago
- ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind (AAAI2025)☆19Apr 16, 2025Updated 11 months ago
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation☆57Sep 12, 2025Updated 6 months ago
- Source code for EMNLP2022 paper "Finding Skill Neurons in Pre-trained Transformers via Prompt Tuning".☆18Mar 13, 2023Updated 3 years ago
- ☆23Feb 3, 2026Updated last month
- Fine-tuning large language models with the DPO algorithm; simple and easy to get started.☆51Jul 3, 2024Updated last year
- GitHub repository for "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas" (ICML 2025)☆71May 2, 2025Updated 10 months ago
- HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches☆37Oct 9, 2025Updated 5 months ago
- ☆30Nov 18, 2025Updated 4 months ago
- [CVPR 2026] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence☆66Jul 9, 2025Updated 8 months ago
- [NeurIPS 2025] EOC-Bench, an innovative benchmark designed to systematically evaluate object-centric embodied cognition in dynamic egocen…☆22Jun 17, 2025Updated 9 months ago
- Repo for running various baselines with Behavior-1K☆34Nov 7, 2025Updated 4 months ago
- code for GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation☆18Dec 7, 2024Updated last year
- STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?☆38Jan 12, 2026Updated 2 months ago
- Happily_Do_USTB university physics experiments☆23Aug 3, 2024Updated last year
- [CVPR25 Highlight] A ChatGPT-Prompted Visual hallucination Evaluation Dataset, featuring over 100,000 data samples and four advanced eval…☆31Apr 16, 2025Updated 11 months ago
- Official repo for From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models☆32Nov 2, 2025Updated 4 months ago
- ☆41Jun 9, 2025Updated 9 months ago
- [ICLR 2026] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence☆80Mar 13, 2026Updated last week
- ☆82Nov 5, 2024Updated last year
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks☆191Sep 24, 2025Updated 5 months ago
- ☆12Oct 5, 2020Updated 5 years ago
- ☆14Dec 16, 2021Updated 4 years ago
- [CVPR 2025] GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration☆20Mar 21, 2025Updated last year
- ☆152Aug 23, 2023Updated 2 years ago
- Jiffy-lidar: a fast, lossless SIMD compression codec for LiDAR streams☆16Jun 14, 2023Updated 2 years ago
- Solution to the PFSP problem☆15Nov 8, 2023Updated 2 years ago
- Evaluation framework for open-domain question answering.☆20May 16, 2021Updated 4 years ago
- (ArXiv25) Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning☆59Sep 30, 2025Updated 5 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.☆356Jun 1, 2025Updated 9 months ago