[ICCV 2025] Improving 3D Large Language Model via Robust Instruction Tuning
☆67 · Oct 19, 2025 · Updated 4 months ago
Alternatives and similar repositories for Robin3D
Users that are interested in Robin3D are comparing it to the libraries listed below
- ☆22 · Jun 5, 2025 · Updated 9 months ago
- ☆19 · Jan 1, 2023 · Updated 3 years ago
- [NeurIPS 2024 Oral] RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation ☆19 · Dec 22, 2024 · Updated last year
- ☆56 · Oct 3, 2024 · Updated last year
- [ICCV 2025] 3DGraphLLM is a model that uses a 3D scene graph and an LLM to perform 3D vision-language tasks. ☆104 · Dec 10, 2025 · Updated 2 months ago
- [ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding ☆64 · Oct 22, 2024 · Updated last year
- Code for "Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers" (NeurIPS 2024) ☆206 · Oct 20, 2025 · Updated 4 months ago
- [ICLR 2025] Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention ☆29 · Feb 21, 2025 · Updated last year
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation ☆13 · Jun 17, 2024 · Updated last year
- [MM2024 Oral] 3D-GRES: Generalized 3D Referring Expression Segmentation ☆42 · Dec 15, 2024 · Updated last year
- [CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Langu… ☆311 · Jul 17, 2024 · Updated last year
- [ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World ☆373 · Oct 21, 2025 · Updated 4 months ago
- This repository is an official implementation of the paper A Simple Baseline for Open-World Tracking via Self-training. ☆10 · Jan 26, 2024 · Updated 2 years ago
- ☆19 · Updated this week
- Source code of "Point Set Voting for Partial Point Clouds Analysis" ☆14 · Jan 5, 2021 · Updated 5 years ago
- 😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources. ☆261 · Jan 14, 2026 · Updated last month
- [MM 2024] [Need only a 3090] MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors ☆125 · Sep 11, 2024 · Updated last year
- StereoDETR Open Source Version ☆27 · Nov 27, 2025 · Updated 3 months ago
- [NAACL 2024] Z-GMOT: Zero-shot Generic Multiple Object Tracking ☆13 · May 3, 2024 · Updated last year
- The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed … ☆11 · Sep 27, 2024 · Updated last year
- AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents ☆62 · Updated this week
- Research on algorithms for garment perception, manipulation... ☆12 · Sep 15, 2023 · Updated 2 years ago
- Official Implementation of Video-MA2MBA ☆12 · Dec 3, 2024 · Updated last year
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision ☆11 · Jul 22, 2024 · Updated last year
- Tracking Multiple Deformable Objects in Egocentric Videos (CVPR 2023) ☆13 · Apr 10, 2023 · Updated 2 years ago
- Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes" ☆56 · Mar 28, 2024 · Updated last year
- Official implementation of ECCV24 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding" ☆278 · Mar 19, 2025 · Updated 11 months ago
- Code&Data for Grounded 3D-LLM with Referent Tokens ☆132 · Jan 5, 2025 · Updated last year
- ☆33 · Sep 27, 2024 · Updated last year
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model ☆17 · Feb 13, 2025 · Updated last year
- This is the official repository of OCL (ICCV 2023). ☆26 · Mar 28, 2024 · Updated last year
- Code for "Distilling Coarse-to-fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding" (ICCV 2023) ☆14 · Oct 2, 2024 · Updated last year
- The official repository of DreamMover ☆34 · Sep 20, 2024 · Updated last year
- [CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding ☆62 · Aug 3, 2024 · Updated last year
- ☆152 · Aug 23, 2023 · Updated 2 years ago
- [ICML 2024] LEO: An Embodied Generalist Agent in 3D World ☆477 · Apr 20, 2025 · Updated 10 months ago
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆61 · Oct 1, 2024 · Updated last year
- ☆27 · Oct 29, 2025 · Updated 4 months ago
- Official code repository of Shuffle-R1 ☆25 · Feb 23, 2026 · Updated last week