ROOT: VLM based System for Indoor Scene Understanding and Beyond
☆41Jan 22, 2025Updated last year
Alternatives and similar repositories for ROOT
Users that are interested in ROOT are comparing it to the libraries listed below.
- [SIGGRAPH Asia 2025] The official implementation of the paper "DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinat…☆33Mar 10, 2026Updated 3 months ago
- Neuroscience Inspired Agent Reasoning Framework☆31May 19, 2025Updated last year
- [ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"☆45Feb 27, 2026Updated 3 months ago
- Introduces a novel Video Trimming (VT) task and proposes an agent-based approach (AVT) for detecting wasted footage, selecting valuable se…☆26Jan 20, 2025Updated last year
- SuperGS: Super-Resolution 3D Gaussian Splatting Enhanced by Variational Residual Features and Uncertainty-Augmented Learning☆11May 24, 2025Updated last year
- Benchmark codebase for 2D range finder based people detectors using the FROG dataset☆13Oct 20, 2025Updated 7 months ago
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023☆16Nov 28, 2024Updated last year
- Code for [AAAI 2026] AffordDex: Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors☆34Dec 26, 2025Updated 5 months ago
- ReSemAct: Advancing Fine-Grained Robotic Manipulation via Semantic Structuring and Affordance Refinement☆17Jan 5, 2026Updated 5 months ago
- LVAS-Agent Code Base☆21Apr 15, 2025Updated last year
- ☆15Jun 14, 2025Updated last year
- [WACV 2025] Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge☆41Oct 29, 2024Updated last year
- Official implementation of the ECCV 2022 paper "CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillati…☆38Oct 5, 2022Updated 3 years ago
- CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning☆38Mar 21, 2025Updated last year
- ☆27Jun 28, 2022Updated 3 years ago
- [RA-L] DRAGON: A Dialogue-Based Robot for Assistive Navigation with Visual Language Grounding☆19Apr 17, 2024Updated 2 years ago
- [IEEE RA-L & ICRA 2026] Semantic-Driven Voxel Representation for LiDAR–Inertial Odometry☆48May 27, 2026Updated 2 weeks ago
- abandoned☆11May 9, 2019Updated 7 years ago
- This repository implements computer vision for real-time chessboard detection and piece recognition. Using OpenCV and Numpy, the system p…☆15Sep 24, 2024Updated last year
- Official repository for gathering data of Revisit Human-Scene Interaction via Space Occupancy (ECCV 2024).☆30Sep 29, 2024Updated last year
- Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation☆16Mar 31, 2026Updated 2 months ago
- Joint magnitude estimation and phase recovery using Cycle-in-Cycle GAN for non-parallel speech enhancement☆10Jan 24, 2022Updated 4 years ago
- ☆13Aug 29, 2025Updated 9 months ago
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering☆18Oct 31, 2024Updated last year
- Pixel-ImageNet☆45Feb 24, 2022Updated 4 years ago
- FSD Tesla Open-source. Real-Time Environment Reconstruction System for Autonomous Vehicles☆20Jan 8, 2026Updated 5 months ago
- Implementation of the Paper Scene-Graph ViT☆10Dec 20, 2024Updated last year
- Hands-On Tutorial on Building Multimodal RAG Systems☆14Apr 10, 2025Updated last year
- ECCV 2022: Learning Shadow Correspondence for Video Shadow Detection☆14Jul 18, 2022Updated 3 years ago
- ☆44Jul 9, 2025Updated 11 months ago
- [CVPR'2022 Oral] The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation☆32Oct 19, 2023Updated 2 years ago
- ☆62Feb 12, 2026Updated 4 months ago
- Temporal memory system for AI assistants with human-like forgetting curves. All data stored locally in human-readable formats: JSONL for …☆36Jun 8, 2026Updated last week
- A Survey on Leveraging Pre-trained Generative Adversarial Networks for Image Editing and Restoration☆17Jul 22, 2022Updated 3 years ago
- [ECCV2024] DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling☆230Dec 7, 2025Updated 6 months ago
- 2D Object Tracking for automated driving using WAYMO data.☆15Jun 17, 2020Updated 5 years ago
- A basic tutorial (theory and practicals) for Visual Place Recognition.☆17Mar 4, 2024Updated 2 years ago
- NeuSyRE: A Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment☆24Mar 10, 2024Updated 2 years ago
- Gaussian Splatting for Robotic Simulation☆24May 20, 2026Updated 3 weeks ago