harrytea / ROOT
ROOT: VLM-based System for Indoor Scene Understanding and Beyond
☆23 Updated last month
Alternatives and similar repositories for ROOT:
Users who are interested in ROOT are comparing it to the repositories listed below.
- Sora Generates Videos with Stunning Geometrical Consistency ☆49 Updated 11 months ago
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆21 Updated 10 months ago
- ☆38 Updated last year
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆34 Updated 9 months ago
- ☆34 Updated 11 months ago
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆60 Updated 5 months ago
- Amodal Depth Anything: Amodal Depth Estimation in the Wild ☆22 Updated 2 months ago
- Official Code for 'TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction' ☆56 Updated 2 months ago
- [ECCV2024] Official Implementation of "NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image" ☆25 Updated 3 months ago
- [NeurIPS2024] DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion ☆30 Updated 5 months ago
- Scaling Properties of Diffusion Models For Perceptual Tasks ☆38 Updated 4 months ago
- ☆13 Updated 11 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆34 Updated 2 weeks ago
- [CVPR 2024] Exploiting Diffusion Prior for Generalizable Dense Prediction ☆70 Updated 10 months ago
- Simple script to parallelize download and extract files for SA-1B Dataset. ☆36 Updated 5 months ago
- Open-Vocabulary Panoptic Segmentation ☆23 Updated 6 months ago
- [CVPR 2024 Highlight] PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis ☆37 Updated last year
- SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality ☆31 Updated 3 months ago
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training" ☆33 Updated 10 months ago
- ☆24 Updated 3 months ago
- This is the project page of ShowRoom3D ☆25 Updated last year
- ☆58 Updated last year
- (ICLR 2024, CVPR 2024) SparseFormer ☆73 Updated 4 months ago
- Semantic Score Distillation Sampling for Compositional Text-to-3D Generation ☆41 Updated 5 months ago
- Official implementation of PARIS3D (Accepted to ECCV 2024). ☆21 Updated 5 months ago
- ReNeg: Learning Negative Embedding with Reward Guidance ☆31 Updated 2 months ago