OpenBMB/MiniCPM-V

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

☆25,926

Alternatives and similar repositories for MiniCPM-V

Users who are interested in MiniCPM-V are comparing it to the repositories listed below.

OpenBMB / MiniCPM
MiniCPM5-1B: A SOTA 1B on-device LLM, small yet powerful.
☆9,930 · Jun 20, 2026 · Updated 3 weeks ago
OpenGVLab / InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
☆10,095 · Sep 22, 2025 · Updated 9 months ago
hiyouga / LlamaFactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
☆73,360 · Updated this week
haotian-liu / LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,927 · Aug 12, 2024 · Updated last year
fishaudio / fish-speech
SOTA Open Source TTS
☆31,306 · Jun 9, 2026 · Updated last month
QwenLM / Qwen3-VL
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
☆19,620 · Jan 30, 2026 · Updated 5 months ago
hpcaitech / Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
☆29,199 · Apr 9, 2026 · Updated 3 months ago
vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆86,566 · Updated this week
unslothai / unsloth
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
☆68,371 · Updated this week
FoundationAgents / MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
☆69,425 · Jan 21, 2026 · Updated 5 months ago
stanford-oval / storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
☆30,138 · Sep 30, 2025 · Updated 9 months ago
mem0ai / mem0
Universal memory layer for AI Agents
☆61,114 · Updated this week
QwenLM / Qwen3
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
☆27,406 · Jan 9, 2026 · Updated 6 months ago
QwenLM / Qwen-VL
The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
☆6,708 · Aug 7, 2024 · Updated last year
langgenius / dify
Production-ready platform for agentic workflow development.
☆149,243 · Updated this week
zai-org / CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
☆2,436 · Mar 3, 2025 · Updated last year
2noise / ChatTTS
A generative speech model for daily dialogue.
☆39,643 · Apr 10, 2026 · Updated 3 months ago
agno-agi / agno
Build, run, and manage agent platforms.
☆41,224 · Updated this week
LLaVA-VL / LLaVA-NeXT
☆4,708 · Jun 15, 2026 · Updated last month
modelscope / ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL…
☆14,849 · Updated this week
microsoft / graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
☆34,495 · Updated this week
microsoft / OmniParser
A simple screen parsing tool towards pure vision based GUI agent
☆25,173 · Updated this week
QwenLM / Qwen-Agent
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
☆16,813 · Mar 4, 2026 · Updated 4 months ago
datalab-to / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆21,113 · Updated this week
myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
☆36,974 · Apr 19, 2025 · Updated last year
zai-org / CogVideo
Text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
☆12,885 · Nov 4, 2025 · Updated 8 months ago
infiniflow / ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat…
☆85,330 · Updated this week
khoj-ai / khoj
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. …
☆35,853 · Jun 24, 2026 · Updated 3 weeks ago
lobehub / lobehub
🤯 LobeHub is your Chief Agent Operator, organizing your agents into 7×24 operations by hiring, scheduling, and reporting on your entire …
☆80,494 · Updated this week
lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,491 · May 1, 2026 · Updated 2 months ago
opendatalab / MinerU
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
☆75,015 · Updated this week
Cinnamon / kotaemon
An open-source RAG-based tool for chatting with your documents.
☆25,563 · Updated this week
OpenHands / OpenHands
🙌 OpenHands: AI-Driven Development
☆81,202 · Updated this week
ollama / ollama
Get up and running with Kimi-K2.6, GLM-5.2, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
☆176,416 · Updated this week
zai-org / CogVLM
A state-of-the-art-level open visual language model | Multimodal pretrained model
☆6,742 · May 29, 2024 · Updated 2 years ago
Ucas-HaoranWei / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,154 · Feb 10, 2025 · Updated last year
hpcaitech / ColossalAI
Making large AI models cheaper, faster and more accessible
☆41,419 · Updated this week
microsoft / autogen
A programming framework for agentic AI
☆59,809 · Apr 15, 2026 · Updated 3 months ago
PKU-YuanGroup / Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
☆12,154 · Mar 8, 2026 · Updated 4 months ago