[Awesome-Spatial-VLMs] This repository is the official, community-maintained resource for the survey paper: Spatial Intelligence in Vision-Language Models: A Comprehensive Survey.
☆64Feb 16, 2026Updated last week
Alternatives and similar repositories for Awesome-Spatial-VLMs
Users that are interested in Awesome-Spatial-VLMs are comparing it to the libraries listed below
- Benchmarking Multi-Image Understanding in Vision and Language Models☆12Jul 29, 2024Updated last year
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 2 years ago
- Spatial Aptitude Training for Multimodal Language Models☆24Feb 8, 2026Updated 3 weeks ago
- Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat…☆21Jun 24, 2025Updated 8 months ago
- ☆35Apr 4, 2024Updated last year
- ☆42Jul 9, 2025Updated 7 months ago
- Official Implementation of DiffCLIP: Differential Attention Meets CLIP☆53Mar 12, 2025Updated 11 months ago
- Official codebase for "Context Aware Deep Learning for Multi Modal Depression Detection" [ICASSP 2019, Oral]☆11Dec 26, 2024Updated last year
- This is a project on visual spatial reasoning tasks: SIBench☆25Jan 12, 2026Updated last month
- [NeurIPS 2025] HoliTom: Holistic Token Merging for Fast Video Large Language Models☆71Oct 10, 2025Updated 4 months ago
- Multi-Agent LLM System for Digital Scam Protection☆12Dec 19, 2024Updated last year
- Repository for the code assignment of the Deep Learning 1 course, Fall 2021 edition☆10Oct 31, 2022Updated 3 years ago
- Colab notebooks exploring different Machine Learning topics.☆16Apr 2, 2022Updated 3 years ago
- [NeurIPS 2025] EOC-Bench, an innovative benchmark designed to systematically evaluate object-centric embodied cognition in dynamic egocen…☆22Jun 17, 2025Updated 8 months ago
- Reward fine-tuning for Stable Diffusion models based on stochastic optimal control, including Adjoint Matching☆64May 30, 2025Updated 9 months ago
- ☆20Oct 15, 2025Updated 4 months ago
- RAG Based LLM Chatbot Built using Open Source Stack (Llama 3.2 Model, BGE Embeddings, and Qdrant running locally within a Docker Containe…☆15Jan 9, 2025Updated last year
- Probabilistic Finite Volume Method based on Affine Gaussian Process inference☆11Jun 10, 2024Updated last year
- ☆10Jun 13, 2022Updated 3 years ago
- Efficient SDE samplers including Gaussian-based probabilistic solvers. Written in JAX.☆10Feb 8, 2025Updated last year
- PyTorch Implementation for InMaP☆11Oct 28, 2023Updated 2 years ago
- [ICLR 2026] Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos☆26Jan 26, 2026Updated last month
- IBM Quantum Challenge Fall 2023☆10May 23, 2023Updated 2 years ago
- ☆13Jun 4, 2025Updated 8 months ago
- Image Search Engine with HuggingFace Sentence Transformer☆12Aug 31, 2023Updated 2 years ago
- ☆14Sep 11, 2025Updated 5 months ago
- Code for NeurIPS 2024 work "MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps"☆17Dec 11, 2024Updated last year
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 4 months ago
- [Arxiv 2025] Official code and datasets of paper: GNNs as Predictors of Agentic Workflow Performances☆21Jan 15, 2026Updated last month
- IJCAI 2022-Initializing Then Refining: A Simple Graph Attribute Imputation Network☆13Mar 4, 2024Updated last year
- Official Implementation of "Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning" in AAAI2024.☆13Feb 28, 2024Updated 2 years ago
- [ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data☆14Sep 30, 2023Updated 2 years ago
- ☆13Jan 22, 2025Updated last year
- Towards Real-Time Practical Image Compression with Lightweight Attention☆14Jan 5, 2026Updated last month
- ☆22Sep 16, 2025Updated 5 months ago
- Room impulse response simulation for various array architectures using Monte-Carlo simulation and quaternions (Python)☆17Updated this week
- This course navigates through the LLMOps pipeline, enabling you to preprocess training data for supervised fine-tuning and deploy cust…☆14Feb 13, 2024Updated 2 years ago
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?☆15Jun 3, 2025Updated 8 months ago
- (NeurIPS 2025 D&B Track) OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps☆25Jan 22, 2026Updated last month