WeitaiKang/Robin3D

[ICCV 2025] Improving 3D Large Language Model via Robust Instruction Tuning

☆71

Alternatives and similar repositories for Robin3D

Users who are interested in Robin3D are comparing it to the repositories listed below.

WeitaiKang / Intent3D
[ICLR 2025] Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention
☆29 • Feb 21, 2025 • Updated last year
WeitaiKang / SegVG
[ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
☆63 • Oct 22, 2024 • Updated last year
ZzZZCHS / Chat-Scene
[NeurIPS 2024 & TPAMI 2026] Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
☆216 • Apr 12, 2026 • Updated 3 months ago
sg-3d / sg3d
☆55 • Oct 3, 2024 • Updated last year
Open3DA / LL3DA
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Langu…
☆319 • Jul 17, 2024 • Updated 2 years ago
3DLLM-Mem / 3DLLM-Mem
☆27 • Jun 5, 2025 • Updated last year
liudaizong / Awesome-3D-Visual-Grounding
😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources.
☆283 • Jan 14, 2026 • Updated 6 months ago
leoli646 / Adapter-X
Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision
☆11 • Jul 22, 2024 • Updated 2 years ago
ZCMax / LLaVA-3D
[ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
☆387 • Oct 21, 2025 • Updated 9 months ago
qzp2018 / MCLN
This is a PyTorch implementation of MCLN proposed by our paper "Multi-branch Collaborative Learning Network for 3D Visual Grounding"(ECCV…
☆27 • Oct 10, 2024 • Updated last year
kleimerTU / HumanCentricLayouts
☆19 • Jan 1, 2023 • Updated 3 years ago
InternRobotics / Grounded_3D-LLM
Code&Data for Grounded 3D-LLM with Referent Tokens
☆136 • Jan 5, 2025 • Updated last year
Chat-3D / Chat-3D
Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes"
☆57 • Mar 28, 2024 • Updated 2 years ago
ZzZZCHS / WS-3DVG
[ICCV 2023] Distilling Coarse-to-fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding
☆14 • Oct 2, 2024 • Updated last year
sosppxo / MDIN
[MM2024 Oral] 3D-GRES: Generalized 3D Referring Expression Segmentation
☆43 • Dec 15, 2024 • Updated last year
sosppxo / RG-SAN
[NeurIPS 2024 Oral] RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
☆20 • Dec 22, 2024 • Updated last year
qumengxue / RIO
☆13 • Oct 30, 2023 • Updated 2 years ago
CurryYuan / ZSVG3D
[CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
☆63 • Aug 3, 2024 • Updated last year
facebookresearch / univlg
Unifying 2D and 3D Vision-Language Understanding
☆126 • Jul 2, 2026 • Updated 3 weeks ago
Fsoft-AIC / Z-GMOT
[NAACL 2024] Z-GMOT: Zero-shot Generic Multiple Object Tracking
☆12 • May 19, 2026 • Updated 2 months ago
22109095 / SimOWT
This repository is an official implementation of the paper A Simple Baseline for Open-World Tracking via Self-training.
☆10 • Jan 26, 2024 • Updated 2 years ago
expectorlin / CONSOLE
Code of the paper "Correctable Landmark Discovery via Large Models for Vision-Language Navigation" (TPAMI 2024)
☆16 • Jun 7, 2024 • Updated 2 years ago
abdo-eldesokey / latentman
This is the official repository for "LatentMan: Generating Consistent Animated Characters using Image Diffusion Models" [CVPRW 2024]
☆22 • Jul 21, 2024 • Updated 2 years ago
HuajianUP / 360VOT
Python Toolkit for 360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking, ICCV2023
☆54 • Jan 16, 2026 • Updated 6 months ago
ATR-DBI / ScanQA
☆161 • Aug 23, 2023 • Updated 2 years ago
PQ3D / PQ3D
Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries"
☆86 • Aug 2, 2024 • Updated last year
bin123apple / InfantAgent
[NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner.
☆39 • Apr 23, 2026 • Updated 3 months ago
LaVi-Lab / Video-3D-LLM
[CVPR 2025] The code for paper ''Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding''.
☆219 • Jun 4, 2025 • Updated last year
zlccccc / 3DVL_Codebase
[CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
☆57 • Jan 29, 2023 • Updated 3 years ago
Dantong88 / LLARVA
☆64 • Dec 14, 2024 • Updated last year
embodied-generalist / embodied-generalist
[ICML 2024] LEO: An Embodied Generalist Agent in 3D World
☆486 • Apr 20, 2025 • Updated last year
dzh19990407 / LBDT
CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
☆24 • Aug 12, 2022 • Updated 3 years ago
InternRobotics / VLM-Grounder
[CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
☆134 • May 22, 2025 • Updated last year
dk-liang / UniSeg3D
[NeurIPS 2024] A Unified Framework for 3D Scene Understanding
☆179 • Jul 7, 2025 • Updated last year
Mingzhen-Huang / DETracker
Tracking Multiple Deformable Objects in Egocentric Videos (CVPR 2023)
☆13 • Apr 10, 2023 • Updated 3 years ago
LinfengYuan1997 / LoSh
[CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation
☆13 • Jun 17, 2024 • Updated 2 years ago
amap-cvlab / ABot-Explorer
☆32 • May 4, 2026 • Updated 2 months ago
Hoyyyaard / LSceneLLM
☆74 • Mar 29, 2025 • Updated last year
OpenM3D / M3DBench
[ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts.
☆61 • Oct 1, 2024 • Updated last year