OpenBMB / MiniCPM-o
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
☆19,202 Updated last month
Alternatives and similar repositories for MiniCPM-o:
Users who are interested in MiniCPM-o are comparing it to the repositories listed below.
- MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo. ☆7,297 Updated 5 months ago
- Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud. ☆16,646 Updated last month
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance. ☆7,501 Updated this week
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆9,690 Updated this week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆7,449 Updated 2 months ago
- Finetune Llama 4, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥 ☆36,949 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆44,418 Updated this week
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ☆46,735 Updated this week
- Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension. ☆6,530 Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆22,171 Updated 8 months ago
- SGLang is a fast serving framework for large language models and vision language models. ☆13,215 Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆23,804 Updated 2 months ago
- Convert PDF to markdown + JSON quickly with high accuracy ☆24,025 Updated last week
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation ☆15,516 Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆24,532 Updated this week
- Retrieval and Retrieval-augmented LLMs ☆9,296 Updated this week
- A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations ☆13,475 Updated this week
- 🚀🚀 Train a 26M-parameter GPT from scratch in just 2h! 🌏 ☆18,757 Updated this week
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT) ☆6,295 Updated 3 months ago
- Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you ne… ☆7,461 Updated this week
- "LightRAG: Simple and Fast Retrieval-Augmented Generation" ☆14,837 Updated this week
- RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. ☆48,818 Updated this week
- A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats. ☆30,300 Updated this week
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ☆20,671 Updated this week
- The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. ☆17,846 Updated 2 weeks ago
- A state-of-the-art open visual language model | multimodal pretrained model ☆6,464 Updated 10 months ago
- The official Meta Llama 3 GitHub site ☆28,604 Updated 2 months ago
- The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud. ☆5,759 Updated 8 months ago
- FlashMLA: Efficient MLA decoding kernels ☆11,428 Updated last month
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model ☆4,862 Updated 6 months ago