Meituan-AutoML / MobileVLM
Strong and Open Vision Language Assistant for Mobile Devices
☆1,213 · Updated last year
Alternatives and similar repositories for MobileVLM
Users interested in MobileVLM are comparing it to the libraries listed below.
- A family of lightweight multimodal models. ☆1,016 · Updated 5 months ago
- InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions ☆2,826 · Updated 3 weeks ago
- LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024) ☆806 · Updated 9 months ago
- A Framework of Small-scale Large Multimodal Models ☆817 · Updated 2 weeks ago
- Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks ☆2,358 · Updated this week
- Next-Token Prediction is All You Need ☆2,115 · Updated last month
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills ☆740 · Updated last year
- Mixture-of-Experts for Large Vision-Language Models ☆2,154 · Updated 5 months ago
- Emu Series: Generative Multimodal Models from BAAI ☆1,720 · Updated 7 months ago
- 【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment ☆805 · Updated last year
- VisionLLM Series ☆1,059 · Updated 2 months ago
- 🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)