[NeurIPS 2025]《SD-VLM: Spatial Measuring and Understanding with Depth-encoded Vision Language Models》
☆37Dec 29, 2025Updated 2 months ago
Alternatives and similar repositories for SD-VLM
Users who are interested in SD-VLM are comparing it to the libraries listed below
- ☆20Oct 15, 2025Updated 4 months ago
- ImageNet3D: Towards General-Purpose Object-Level 3D Understanding☆20Dec 6, 2024Updated last year
- [ICCV 2025] Official implementation of "AD-GS: Object-Aware B-Spline Gaussian Splatting for Self-Supervised Autonomous Driving"☆35Jul 15, 2025Updated 7 months ago
- Code for paper "Half-Physics: Enabling Kinematic 3D Human Model with Physical Interactions". Coming soon.☆33Jul 31, 2025Updated 7 months ago
- ☆22Aug 17, 2024Updated last year
- A benchmark that focuses on the sampling dilemma in long-video tasks. Through well-designed tasks, it evaluates the sampling efficiency o…☆27Aug 7, 2025Updated 6 months ago
- [ICCAD 2024] SNNGX: Securing Spiking Neural Networks with Genetic XOR Encryption on RRAM-based Neuromorphic Accelerator☆11Feb 3, 2026Updated 3 weeks ago
- Dexterous World Models☆73Feb 22, 2026Updated last week
- Unlocking Iterative Reasoning for Any Image Editor☆89Jan 18, 2026Updated last month
- An official implementation of RoDyGS: Robust Dynamic Gaussian Splatting for Casual Videos☆37Dec 11, 2024Updated last year
- Thinking in 360°: Humanoid Visual Search in the Wild☆118Feb 15, 2026Updated 2 weeks ago
- [ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets☆63Aug 6, 2025Updated 6 months ago
- Embodied Question Answering (EQA) benchmark and method (ICCV 2025)☆47Aug 12, 2025Updated 6 months ago
- CVPR 2025: VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction☆77Aug 1, 2025Updated 7 months ago
- ☆11May 15, 2024Updated last year
- ☆12Jun 11, 2025Updated 8 months ago
- Fleming-R1: Toward Expert-Level Medical Reasoning via Reinforcement Learning☆30Sep 29, 2025Updated 5 months ago
- ☆10Oct 5, 2022Updated 3 years ago
- Python package providing functionality and plotting for chemistry method comparison☆16Feb 28, 2024Updated 2 years ago
- Agent-to-Sim Learning Interactive Behavior from Casual Videos.☆48Oct 16, 2024Updated last year
- ☆14Feb 26, 2025Updated last year
- Source code for paper Toward Planet-Wide Traffic Camera Calibration (WACV 2024)☆16Jun 4, 2024Updated last year
- [CVPR 2025] GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration☆20Mar 21, 2025Updated 11 months ago
- ☆78Nov 4, 2025Updated 3 months ago
- ☆12Apr 18, 2025Updated 10 months ago
- ☆14Sep 11, 2025Updated 5 months ago
- [AAAI 2026] Official code for "Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-L…☆14Nov 17, 2025Updated 3 months ago
- A Video Quality Analysis Toolkit☆13May 16, 2025Updated 9 months ago
- Official implementation of "CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models"☆39Updated this week
- Official repository for GraphEQA☆22Sep 25, 2025Updated 5 months ago
- [ICLR 2026 Oral] Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation☆87Feb 7, 2026Updated 3 weeks ago
- [NeurIPS 2025] Streaming 3D Reconstruction with Explicit Spatial Pointer Memory☆179Sep 26, 2025Updated 5 months ago
- [RA-L] DRAGON: A Dialogue-Based Robot for Assistive Navigation with Visual Language Grounding☆17Apr 17, 2024Updated last year
- The official implementation of NeurIPS 2025 D&B paper: IndustryEQA: Pushing the frontiers of Embodied Question Answering in Industrial Sc…☆12Sep 25, 2025Updated 5 months ago
- ☆18Sep 25, 2025Updated 5 months ago
- ☆20Jun 3, 2025Updated 8 months ago
- Code for "AffordanceLLM: Grounding Affordance from Vision Language Models"☆14Oct 18, 2024Updated last year
- ☆13Mar 28, 2025Updated 11 months ago
- ☆15Mar 22, 2024Updated last year