Sta8is / DINO-Foresight
Official Implementation of DINO-Foresight: Looking into the Future with DINO
☆60 Updated last week
Alternatives and similar repositories for DINO-Foresight
Users who are interested in DINO-Foresight are comparing it to the repositories listed below.
- Official code for "JAFAR: Jack up Any Feature at Any Resolution" ☆151 Updated last week
- Implementation of Zero-Shot Video Semantic Segmentation [CVPR 2025] ☆53 Updated 6 months ago
- Scene-Centric Unsupervised Panoptic Segmentation (CVPR 2025 Highlight) ☆66 Updated 2 months ago
- [NeurIPS2023] 3D-OWIS is capable of detecting unknown instances in inference, and progressively learning novel classes in the process of … ☆67 Updated last year
- ☆24 Updated 5 months ago
- [ECCV 2024] PyTorch code for our ECCV'24 paper NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Ra… ☆103 Updated 5 months ago
- ☆101 Updated last week
- [CVPR 2024] 🏡Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning ☆80 Updated last year
- Unifying 2D and 3D Vision-Language Understanding ☆100 Updated last month
- [ICLR 2025] Official Implementation of M3: 3D-Spatial Multimodal Memory ☆174 Updated 4 months ago
- Generative World Explorer ☆154 Updated 2 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆112 Updated 5 months ago
- LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS ☆96 Updated last month
- [CVPR 2025] Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers ☆35 Updated last week
- Program synthesis for 3D spatial reasoning ☆47 Updated 2 months ago
- [CVPR 2024] Probing the 3D Awareness of Visual Foundation Models ☆320 Updated last year
- [EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation ☆18 Updated 2 months ago
- Scaling Properties of Diffusion Models For Perceptual Tasks (CVPR 2025) ☆42 Updated 3 months ago
- [ECCV 2024] DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control ☆82 Updated 9 months ago
- ☆34 Updated last year
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction ☆247 Updated 3 weeks ago
- 3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding ☆248 Updated this week
- [AAAI 2025] GFlow: Recovering 4D World from Monocular Video ☆49 Updated 3 months ago
- ☆33 Updated 3 months ago
- [NeurIPS 2023] Weakly Supervised 3D Open-vocabulary Segmentation ☆120 Updated last year
- ☆52 Updated last month
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆147 Updated 3 months ago
- [ICLR'24] GTA: A Geometry-Aware Attention Mechanism for Multi-view Transformers ☆140 Updated 4 months ago
- SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding ☆55 Updated last month
- [IV 2025, Oral] Official code of "6Img-to-3D: Few-Image Large-Scale Outdoor Novel View Synthesis" ☆77 Updated last month