OpenBMB / MiniCPM-o
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
☆19,629Updated last week
Alternatives and similar repositories for MiniCPM-o
Users who are interested in MiniCPM-o are comparing it to the libraries listed below.
- MiniCPM4: Ultra-Efficient LLMs on End Devices, achieving 5x+ speedup on typical end-side chips☆7,951Updated last week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.☆9,608Updated this week
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.☆22,102Updated last week
- The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.☆18,509Updated this week
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance☆8,356Updated 3 weeks ago
- An open-source RAG-based tool for chatting with your documents.☆22,483Updated last week
- Memory for AI Agents; Announcing OpenMemory MCP - local and secure memory management.☆34,513Updated this week
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)☆6,394Updated 5 months ago
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.☆10,997Updated last month
- This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.☆11,988Updated 2 weeks ago
- "LightRAG: Simple and Fast Retrieval-Augmented Generation"☆17,437Updated last week
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation☆17,060Updated last week
- SOTA Open Source TTS☆21,914Updated last week
- Fully open reproduction of DeepSeek-R1☆24,819Updated 2 weeks ago
- Full-stack framework for building Multi-Agent Systems with memory, knowledge and reasoning.☆28,467Updated this week
- 🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org☆12,919Updated this week
- Janus-Series: Unified Multimodal Understanding and Generation Models☆17,380Updated 4 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages☆17,641Updated last week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.☆24,598Updated last month
- No fortress, purely open ground. OpenManus is Coming.☆46,905Updated this week
- Open-Sora: Democratizing Efficient Video Production for All☆26,691Updated last month
- BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI…☆8,855Updated this week
- 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming☆56,520Updated last week
- tiny vision language model☆8,073Updated this week
- Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation☆8,495Updated 9 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆7,656Updated 4 months ago
- ☆7,028Updated last week
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.☆40,545Updated last week
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag…☆24,159Updated this week
- Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI☆22,357Updated 3 weeks ago