☆12Jan 10, 2025Updated last year
Alternatives and similar repositories for Spatial-MM
Users that are interested in Spatial-MM are comparing it to the libraries listed below.
- [TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.☆143Mar 25, 2023Updated 3 years ago
- This repository contains the pinyin input method assignment I completed for my artificial intelligence course.☆11Feb 16, 2022Updated 4 years ago
- A CustomNet node for ComfyUI☆10Aug 11, 2024Updated last year
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning☆36Jul 15, 2025Updated 9 months ago
- ☆84Nov 5, 2024Updated last year
- Code release for "Understanding Bias in Large-Scale Visual Datasets"☆23Dec 4, 2024Updated last year
- [ICLR 2025 Oral] Official Implementation for "Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Un…☆21Oct 24, 2024Updated last year
- [CVPR 2026 Findings] This repo is the official implementation of "Euclid’s Gift: Enhancing Spatial Perception and Reasoning in Vision‑La…☆28Mar 15, 2026Updated last month
- A curated list of self-study materials, including research blogs☆16Dec 12, 2016Updated 9 years ago
- ☆14Dec 17, 2018Updated 7 years ago
- ☆31Jun 25, 2024Updated last year
- Experiments with representation engineering☆14Feb 28, 2024Updated 2 years ago
- [NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"☆320Dec 14, 2024Updated last year
- Computer Networking: A Top-Down Approach, 8th edition: resources and homework☆15Apr 23, 2021Updated 5 years ago
- Deep learning and spectral embedding for graph partitioning☆14May 13, 2022Updated 3 years ago
- [EMNLP 2024] Official repository for paper "From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis"☆21Oct 15, 2024Updated last year
- Official GitHub repository for the paper "Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates"☆14Mar 22, 2024Updated 2 years ago
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations"☆102Nov 30, 2025Updated 5 months ago
- An API that detects the expiration date from a picture of a product's packaging, using deep learning algorithms☆11Jun 4, 2022Updated 3 years ago
- The good practice in the VQA system such as pos-tag attention, structured triplet learning and triplet attention is very general and can be…☆19Jan 23, 2018Updated 8 years ago
- Top-10 strategy for a text classification competition☆18Aug 14, 2021Updated 4 years ago
- Building Llama 3 from scratch using PyTorch☆13Sep 1, 2024Updated last year
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning☆110Jul 9, 2025Updated 9 months ago
- ☆11Dec 20, 2024Updated last year
- ☆102Jan 27, 2026Updated 3 months ago
- Standardization Project for mjai Format Specification☆12Aug 28, 2024Updated last year
- [NeurIPS24] VisMin: Visual Minimal-Change Understanding☆19Mar 3, 2025Updated last year
- Official Pytorch Code Implementation for "UniDB: A Unified Diffusion Bridge Framework via Stochastic Optimal Control", accepted by ICML 2…☆34Jan 30, 2026Updated 3 months ago
- ☆12Jan 31, 2023Updated 3 years ago
- ☆47Nov 8, 2024Updated last year
- Codes and data for AAAI-24 paper "Advancing Spatial Reasoning in Large Language Models: An In-depth Evaluation and Enhancement Using the …☆14Apr 23, 2024Updated 2 years ago
- Python script that splits videos into individual frames.☆11Dec 21, 2021Updated 4 years ago
- ☆14Jun 22, 2022Updated 3 years ago
- Docker environment for TongjiThesis (Tongji University thesis LaTeX template)☆12Mar 28, 2026Updated last month
- Japanese mahjong tile efficiency analysis☆11Feb 9, 2026Updated 2 months ago
- dmps code☆39Jan 24, 2024Updated 2 years ago
- Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".☆71Feb 28, 2024Updated 2 years ago
- This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Zhu.☆17Sep 13, 2024Updated last year
- Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch☆33Apr 27, 2025Updated last year