OpenBMB / MiniCPM-o
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
☆18,531 Updated this week
Alternatives and similar repositories for MiniCPM-o:
Users who are interested in MiniCPM-o are comparing it to the repositories listed below:
- MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo. ☆7,213 Updated 3 months ago
- Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension. ☆5,852 Updated 3 weeks ago
- Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud. ☆15,511 Updated this week
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆7,531 Updated this week
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API. ☆26,234 Updated this week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆7,908 Updated this week
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance. ☆7,059 Updated last month
- Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥 ☆30,529 Updated this week
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ☆41,097 Updated this week
- Agno is a lightweight library for building multi-modal Agents ☆19,111 Updated this week
- "LightRAG: Simple and Fast Retrieval-Augmented Generation" ☆11,964 Updated this week
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT) ☆6,030 Updated last month
- The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. ☆16,936 Updated 2 weeks ago
- 🚀🚀 Train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏 ☆10,231 Updated this week
- The Memory layer for AI Agents ☆24,688 Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆21,480 Updated 6 months ago
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆26,764 Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆16,314 Updated this week
- Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud. ☆4,479 Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆22,523 Updated this week
- KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning a… ☆5,287 Updated this week
- Convert PDF to markdown + JSON quickly with high accuracy ☆20,826 Updated this week
- Open-Sora: Democratizing Efficient Video Production for All ☆23,372 Updated this week
- Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you ne… ☆6,428 Updated this week
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model ☆4,762 Updated 4 months ago
- SGLang is a fast serving framework for large language models and vision language models. ☆10,325 Updated this week
- Mobile-Agent: The Powerful Mobile Device Operation Assistant Family ☆3,440 Updated last week
- DeepSeek-VL: Towards Real-World Vision-Language Understanding ☆3,471 Updated 9 months ago
- The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud. ☆5,470 Updated 6 months ago