kaist-cvml/3d-vlm-gd

[EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation

☆36

Alternatives and similar repositories for 3d-vlm-gd

Users that are interested in 3d-vlm-gd are comparing it to the libraries listed below.

kaist-cvml / scribble-guided-diffusion
[ICIP 2025] Scribble-Guided Diffusion for Training-free Text-to-Image Generation
☆25 · Oct 2, 2024 · Updated last year
fereenwong / cdViews
official code for "3D Question Answering via only 2D Vision-Language Models"
☆23 · Mar 4, 2026 · Updated 2 months ago
hustvl / Spa3R
Spa3R: Predictive Spatial Field Modeling for 3D Visual Reasoning
☆49 · Mar 25, 2026 · Updated 2 months ago
wookiekim / CorrespondentDream
Official PyTorch implementation of CorrespondentDream: Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences (CVPR 2024 Po…
☆19 · Apr 29, 2024 · Updated 2 years ago
johnson111788 / SpatialReasoner
Training recipe for SpatialReasoner [NeurIPS 2025]
☆45 · Apr 5, 2026 · Updated last month
KAIST-Visual-AI-Group / VG-AVS
Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection
☆23 · Feb 5, 2026 · Updated 3 months ago
KAIST-Visual-AI-Group / APC-VLM
[ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
☆61 · Sep 12, 2025 · Updated 8 months ago
KAIST-Visual-AI-Group / ORIGEN
[NeurIPS 2025] Official code for ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
☆32 · Oct 17, 2025 · Updated 7 months ago
Visual-AI / 3DRS
[NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
☆157 · Dec 9, 2025 · Updated 5 months ago
kaist-cvml / part-clipseg
[NeurIPS 2024] Understanding Multi-Granularity for Open-Vocabulary Part Segmentation
☆62 · Dec 29, 2024 · Updated last year
amazon-far / deltatok
[CVPR 2026 Highlight] A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens
☆117 · May 8, 2026 · Updated 3 weeks ago
hyungjin-chung / VPS
☆16 · Sep 11, 2025 · Updated 8 months ago
zoezheng126 / Spatio-Temporal-LLM
☆19 · Aug 7, 2025 · Updated 9 months ago
KAIST-Visual-AI-Group / GrounDiT
[NeurIPS 2024] Official Implementation of GrounDiT
☆59 · Dec 12, 2024 · Updated last year
ngailapdi / SplatTalk
☆45 · Apr 9, 2026 · Updated last month
ugonfor / DGQ
[ICLR 2025] DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
☆19 · Mar 25, 2025 · Updated last year
KAIST-Visual-AI-Group / PDS
Official Implementation of Posterior Distillation Sampling
☆93 · Jul 7, 2025 · Updated 10 months ago
SNU-VGILab / e-latentlpips
Unofficial implementation of E-LatentLPIPS in Diffusion2GAN
☆20 · Sep 5, 2024 · Updated last year
KAIST-Visual-AI-Group / StochSync
Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat…
☆21 · Jun 24, 2025 · Updated 11 months ago
krafton-ai / DAS
Official implementation for Diffusion Alignment as Sampling (DAS), ICLR'25, Spotlight
☆63 · Feb 12, 2025 · Updated last year
LaVi-Lab / VG-LLM
The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
☆236 · Nov 28, 2025 · Updated 6 months ago
YunzeMan / Situation3D
[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning
☆44 · Dec 9, 2024 · Updated last year
kaist-cvml / DreamCatalyst
[ICLR 2025] DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
☆102 · Jan 22, 2025 · Updated last year
zdk258 / CorrCLIP
[ICCV 2025 Oral] CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation
☆68 · Aug 1, 2025 · Updated 9 months ago
ananthu-aniraj / pdiscoformer
[ECCV 2024 Oral] Official implementation of the paper "PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers"
☆20 · Apr 9, 2026 · Updated last month
sled-group / COMFORT
[ICLR 2025 Oral] Official Implementation for "Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Un…
☆21 · Oct 24, 2024 · Updated last year
cvlab-kaist / GSD
Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling
☆29 · Sep 17, 2024 · Updated last year
star-kwon / FCDM
[CVPR 2026] Official repository for "Reviving ConvNeXt for Efficient Convolutional Diffusion Models"
☆63 · Mar 26, 2026 · Updated 2 months ago
lern-to-write / STC
[CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
☆65 · Feb 25, 2026 · Updated 3 months ago
snuvclab / pegasus
[CVPR 2024] PEGASUS: Personalized Generative 3D Avatars with Composable Attributes
☆60 · Dec 30, 2024 · Updated last year
lyan62 / vlm-info-loss
☆22 · Sep 16, 2025 · Updated 8 months ago
Yangr116 / VST
Visual Spatial Tuning
☆198 · Mar 25, 2026 · Updated 2 months ago
ShijieZhou-UCLA / Feature4X
[CVPR 2025] Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
☆40 · Oct 18, 2025 · Updated 7 months ago
KAIST-Visual-AI-Group / PartSTAD
Official implementation of PartSTAD: 2D-to-3D Part Segmentation Task Adaptation (ECCV 2024).
☆56 · Nov 7, 2024 · Updated last year
LAARRRY / CamTrol
Implementation of CamTrol: Training-free Camera Control for Video Generation
☆33 · Oct 2, 2025 · Updated 7 months ago
wufeim / imagenet3d
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
☆21 · Dec 6, 2024 · Updated last year
JIA-Lab-research / LSDBench
A benchmark that focuses on the sampling dilemma in long-video tasks. Through well-designed tasks, it evaluates the sampling efficiency o…
☆28 · Aug 7, 2025 · Updated 9 months ago
TIGER-AI-Lab / QuickVideo
Quick Long Video Understanding [TMLR2025]
☆78 · Oct 27, 2025 · Updated 7 months ago
WU-CVGL / SIU3R
[NeurIPS 2025 Spotlight] Official implementation of the SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alig…
☆161 · Sep 25, 2025 · Updated 8 months ago