LEO: A powerful Hybrid Multimodal LLM
☆19 · Jan 18, 2025 · Updated last year
Alternatives and similar repositories for LEO
Users who are interested in LEO are comparing it to the repositories listed below
- ☆10 · Apr 7, 2025 · Updated 10 months ago
- ☆13 · Mar 28, 2025 · Updated 11 months ago
- Implementation of Pix2Seq in PyTorch ☆10 · Feb 3, 2022 · Updated 4 years ago
- Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels ☆69 · Updated this week
- Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models ☆21 · Apr 16, 2025 · Updated 10 months ago
- Visual Spatial Tuning ☆176 · Feb 19, 2026 · Updated 2 weeks ago
- ☆24 · Jul 11, 2025 · Updated 7 months ago
- [CVPR 2025] Official code of "DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting" ☆46 · Sep 5, 2025 · Updated 6 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ☆19 · Jul 20, 2024 · Updated last year
- [NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1" ☆31 · Feb 22, 2026 · Updated last week
- LLMBind: A Unified Modality-Task Integration Framework ☆19 · Jun 16, 2024 · Updated last year
- [CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models ☆18 · Jul 22, 2024 · Updated last year
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding