Sta8is / DINO-Foresight
Official Implementation of DINO-Foresight: Looking into the Future with DINO
☆54 Updated 4 months ago
Alternatives and similar repositories for DINO-Foresight
Users who are interested in DINO-Foresight are comparing it to the repositories listed below.
- Implementation of Zero-Shot Video Semantic Segmentation [CVPR 2025] ☆50 Updated 4 months ago
- Scene-Centric Unsupervised Panoptic Segmentation (CVPR 2025 Highlight) ☆57 Updated last month
- [NeurIPS2023] 3D-OWIS is capable of detecting unknown instances in inference, and progressively learning novel classes in the process of … ☆68 Updated last year
- Official code for "JAFAR: Jack up Any Feature at Any Resolution" ☆142 Updated last week
- [CVPR 2025] Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers ☆32 Updated 2 weeks ago
- ☆21 Updated 3 months ago
- [ICLR 2025] Official Implementation of M3: 3D-Spatial Multimodal Memory ☆167 Updated 2 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆110 Updated 4 months ago
- [CVPR 2025 Highlight] Towards Autonomous Micromobility through Scalable Urban Simulation ☆81 Updated this week
- ☆94 Updated 3 months ago
- [NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding ☆94 Updated 5 months ago
- [ECCV 2024] Pytorch code for our ECCV'24 paper NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Ra… ☆102 Updated 3 months ago
- Code for the paper "AMEGO: Active Memory from long EGOcentric videos" published at ECCV 2024 ☆38 Updated 7 months ago
- ☆43 Updated 2 weeks ago
- ☆33 Updated 8 months ago
- [CVPR 2024] 🏡Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning ☆79 Updated last year
- ☆55 Updated this week
- VaViM and VaVAM: Autonomous Driving through Video Generative Modeling (official repository). ☆96 Updated 2 weeks ago
- Unifying 2D and 3D Vision-Language Understanding ☆95 Updated 3 months ago
- [ICLR 2025] Official code of "Segment any 3D Object with Language" ☆49 Updated 3 weeks ago
- SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding ☆53 Updated last week
- The official repository for paper "MLLMs Need 3D-Aware Representation Supervision for Scene Understanding" ☆67 Updated last month
- 3DGraphLLM is a model that uses a 3D scene graph and an LLM to perform 3D vision-language tasks. ☆66 Updated 2 months ago
- Generative World Explorer ☆150 Updated last month
- [CVPR 2024] Probing the 3D Awareness of Visual Foundation Models ☆314 Updated last year
- [3DV 2025] Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model ☆92 Updated last month
- [ICLR 2025] Dataset and Code for Paper "Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels" ☆41 Updated last week
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆39 Updated 7 months ago
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction ☆214 Updated 2 weeks ago
- [CVPR 25] Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation ☆111 Updated 2 weeks ago