CurryYuan/X-Trans2Cap

[CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

☆36

Alternatives and similar repositories for X-Trans2Cap

Users interested in X-Trans2Cap are comparing it to the repositories listed below.

daveredrum / Scan2Cap
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
☆106 · Sep 6, 2022 · Updated 3 years ago
zlccccc / 3DVL_Codebase
[CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
☆57 · Jan 29, 2023 · Updated 3 years ago
CurryYuan / PhraseRefer
[TNNLS] Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases
☆17 · Jul 10, 2025 · Updated last year
heng-hw / SpaCap3D
[IJCAI 2022] Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds (official PyTorch implementation)
☆21 · Aug 31, 2022 · Updated 3 years ago
daveredrum / D3Net
[ECCV2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
☆44 · Aug 27, 2022 · Updated 3 years ago
hanhung / TGNN
☆26 · Mar 15, 2022 · Updated 4 years ago
zlccccc / 3DVG-Transformer
[ICCV2021] 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds
☆43 · Jul 6, 2022 · Updated 4 years ago
nickgkan / butd_detr
Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds"
☆95 · Jun 9, 2023 · Updated 3 years ago
mako443 / Text2Pos-CVPR2022
Code, dataset and models for our CVPR 2022 publication "Text2Pos"
☆58 · Jun 17, 2022 · Updated 4 years ago
PNXD / FFL-3DOG
Free-form Description-guided 3D Visual Graph Networks for Object Grounding in Point Cloud
☆18 · Jun 23, 2022 · Updated 4 years ago
referit3d / referit3d
Code accompanying our ECCV-2020 paper on 3D Neural Listeners.
☆141 · Jun 29, 2021 · Updated 5 years ago
daveredrum / ScanRefer
[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
☆303 · Feb 10, 2023 · Updated 3 years ago
HaolinLiu97 / Refer-it-in-RGBD
Repository of our paper 'Refer-it-in-RGBD' in CVPR 2021
☆42 · May 24, 2024 · Updated 2 years ago
guxiao0822 / LT-DS
[ECCV 2022] Tackling Long-Tailed Category Distribution Under Domain Shifts
☆25 · Nov 29, 2022 · Updated 3 years ago
CurryYuan / InstanceRefer
[ICCV 2021] InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextua…
☆74 · Mar 22, 2025 · Updated last year
ATR-DBI / ScanQA
☆161 · Aug 23, 2023 · Updated 2 years ago
Feynben / ADAS
A Simple Active-and-Adaptive Baseline for Cross-Domain 3D Semantic Segmentation
☆13 · Dec 22, 2022 · Updated 3 years ago
liuzhengzhe / 3D-to-2D-Distillation-for-Indoor-Scene-Parsing
CVPR 2021 Oral https://arxiv.org/abs/2104.02243
☆48 · Oct 30, 2023 · Updated 2 years ago
fjhzhixi / 3D-SPS
☆64 · May 17, 2023 · Updated 3 years ago
zyang-ur / SAT
SAT: 2D Semantics Assisted Training for 3D Visual Grounding, ICCV 2021 (Oral)
☆32 · Sep 29, 2021 · Updated 4 years ago
pumpkinnan / BAN
Bi-Directional Attention for Joint Instance and Semantic Segmentation in Point Clouds (BAN)
☆16 · Apr 27, 2021 · Updated 5 years ago
Likekekeke / EasyGaze3D
Official repository of EasyGaze3D: Towards Effective and Flexible 3D Gaze Estimation from a Single RGB Camera
☆10 · Aug 3, 2023 · Updated 2 years ago
antao97 / SegGroup.annotator
Seg-Level Label Annotator
☆25 · Jul 24, 2022 · Updated 4 years ago
guanghuixu / AnchorCaptioner
☆30 · May 7, 2021 · Updated 5 years ago
zeliu98 / Group-Free-3D
Group-Free 3D Object Detection via Transformers
☆256 · Jun 2, 2021 · Updated 5 years ago
TianYafu / road-status-graph-dataset
☆21 · Dec 10, 2020 · Updated 5 years ago
sega-hsj / MVT-3DVG
[CVPR 2022] Multi-View Transformer for 3D Visual Grounding
☆81 · Nov 9, 2022 · Updated 3 years ago
aurooj / WSG-VQA-VLTransformers
Weakly Supervised Grounding for VQA in Vision-Language Transformers
☆17 · May 6, 2023 · Updated 3 years ago
daveredrum / ScanRefer_Browser
☆11 · Feb 1, 2023 · Updated 3 years ago
canqin001 / PointDAN
Code of NeurIPS19 Paper "PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation".
☆135 · Jan 24, 2021 · Updated 5 years ago
ZrrSkywalker / PointCLIP
[CVPR 2022] PointCLIP: Point Cloud Understanding by CLIP
☆409 · Nov 24, 2022 · Updated 3 years ago
tgxs002 / wikiscenes
Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision. ICCV 2021.
☆43 · Apr 30, 2024 · Updated 2 years ago
aim-uofa / DyCo3D
☆128 · Nov 10, 2023 · Updated 2 years ago
ZephyrZhuQi / ssbaseline
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps [AAAI2021]
☆57 · Apr 5, 2022 · Updated 4 years ago
wbhu / BPNet
[CVPR 2021 Oral] Bidirectional Projection Network for Cross Dimension Scene Understanding
☆180 · Jul 20, 2021 · Updated 5 years ago
3dlg-hcvc / multi3drefer
[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects
☆98 · Mar 26, 2026 · Updated 3 months ago
NUAAXQ / MLCVNet
[CVPR 2020] MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
☆123 · Nov 18, 2021 · Updated 4 years ago
facebookresearch / PointContrast
Code for the paper "PointContrast: Unsupervised Pretraining for 3D Point Cloud Understanding"
☆349 · Aug 30, 2021 · Updated 4 years ago
Sekunde / Pri3D
[ICCV'21] Pri3D: Can 3D Priors Help 2D Representation Learning?
☆148 · Dec 17, 2021 · Updated 4 years ago