ROOT: VLM based System for Indoor Scene Understanding and Beyond
☆41Jan 22, 2025Updated last year
Alternatives and similar repositories for ROOT
Users who are interested in ROOT are comparing it to the libraries listed below.
- Neuroscience Inspired Agent Reasoning Framework☆31May 19, 2025Updated last year
- [ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"☆45Feb 27, 2026Updated 2 months ago
- Extreme Rotation Estimation using Dense Correlation Volumes☆44Jan 10, 2023Updated 3 years ago
- SuperGS: Super-Resolution 3D Gaussian Splatting Enhanced by Variational Residual Features and Uncertainty-Augmented Learning☆11May 24, 2025Updated last year
- Benchmark codebase for 2D range finder based people detectors using the FROG dataset☆13Oct 20, 2025Updated 7 months ago
- [ICRA'25] NeuGrasp: Generalizable Neural Surface Reconstruction with Background Priors for Material-Agnostic Object Grasp Detection☆22Jan 29, 2026Updated 3 months ago
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023☆16Nov 28, 2024Updated last year
- Code for [AAAI 2026] AffordDex: Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors☆32Dec 26, 2025Updated 5 months ago
- Manipulating semantic data within Python☆20Jan 14, 2025Updated last year
- ReSemAct: Advancing Fine-Grained Robotic Manipulation via Semantic Structuring and Affordance Refinement☆17Jan 5, 2026Updated 4 months ago
- LVAS-Agent Code Base☆21Apr 15, 2025Updated last year
- ☆15Jun 14, 2025Updated 11 months ago
- Homework assignments for Andrew Ng's deep learning course☆10Jan 28, 2020Updated 6 years ago
- [EMNLP 2025] The official implementation of "Zero-shot Multimodal Document Retrieval via Cross-Modal Question Generation"☆15Aug 26, 2025Updated 9 months ago
- Official implementation of the ECCV 2022 paper "CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillati…☆38Oct 5, 2022Updated 3 years ago
- [CoRL 2024] Tag Map: A Text-Based Map for Spatial Reasoning and Navigation with Large Language Models☆37Dec 7, 2024Updated last year
- Corpus to accompany: "Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding"☆11Apr 11, 2025Updated last year
- CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning☆38Mar 21, 2025Updated last year
- [CVPR 2024] Shadows Don’t Lie and Lines Can’t Bend! Generative Models don’t know Projective Geometry...for now☆49Jun 19, 2024Updated last year
- Code for RA-L paper "One-shot Learning for Task-oriented Grasping"☆12May 9, 2024Updated 2 years ago
- ☆14Mar 23, 2024Updated 2 years ago
- ECCV2020_Spatial Hierarchy Aware Residual Pyramid Network for Time-of-Flight Depth Denoising☆12Sep 24, 2020Updated 5 years ago
- Code for REACT: Real-time Efficient Attribute Clustering and Transfer for Updatable 3D Scene Graph☆17Feb 12, 2026Updated 3 months ago
- [RA-L] DRAGON: A Dialogue-Based Robot for Assistive Navigation with Visual Language Grounding☆19Apr 17, 2024Updated 2 years ago
- [IEEE RA-L & ICRA 2026] Semantic-Driven Voxel Representation for LiDAR–Inertial Odometry☆47Nov 20, 2025Updated 6 months ago
- abandoned☆11May 9, 2019Updated 7 years ago
- This repository implements computer vision for real-time chessboard detection and piece recognition. Using OpenCV and Numpy, the system p…☆15Sep 24, 2024Updated last year
- CAR: Class-aware Regularizations for Semantic Segmentation (ECCV-2022)☆30Oct 26, 2022Updated 3 years ago
- Official repository for gathering data of Revisit Human-Scene Interaction via Space Occupancy (ECCV 2024).☆30Sep 29, 2024Updated last year
- The official repository for the paper "Statler: State-Maintaining Language Models for Embodied Reasoning"☆13Jun 10, 2024Updated last year
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering☆18Oct 31, 2024Updated last year
- EKF, EIF and SEIF for SLAM☆10Nov 16, 2018Updated 7 years ago
- Pixel-ImageNet☆45Feb 24, 2022Updated 4 years ago
- FSD Tesla Open-source. Real-Time Environment Reconstruction System for Autonomous Vehicles☆20Jan 8, 2026Updated 4 months ago
- Implementation of the Paper Scene-Graph ViT☆10Dec 20, 2024Updated last year
- ECCV 2022: Learning Shadow Correspondence for Video Shadow Detection☆14Jul 18, 2022Updated 3 years ago
- Official repository of "Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion" (ACMMM 2024)☆15Oct 31, 2024Updated last year
- ☆44Jul 9, 2025Updated 10 months ago
- [CVPR'2022 Oral] The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation☆32Oct 19, 2023Updated 2 years ago