deepseek-ai / DeepSeek-OCR
Contexts Optical Compression
☆21,996 Updated 2 months ago
Alternatives and similar repositories for DeepSeek-OCR
Users who are interested in DeepSeek-OCR are comparing it to the libraries listed below.
- Advanced Software Development Technology group assignment ☆24 Updated last week
- The absolute trainer to light up AI agents. ☆10,314 Updated this week
- Toolkit for linearizing PDFs for LLM datasets/training ☆16,759 Updated this week
- A simple yet powerful agent framework that delivers with open-source models ☆4,248 Updated this week
- "RAG-Anything: All-in-One RAG Framework" ☆12,067 Updated last week
- Tongyi Deep Research, the Leading Open-source Deep Research Agent ☆17,937 Updated last week
- A research prototype of a human-centered web agent ☆9,594 Updated last month
- Multilingual Document Layout Parsing in a Single Vision-Language Model ☆6,120 Updated 3 weeks ago
- How can we build a true AI agent? Like Claude Code. ☆14,374 Updated last week
- ☆1,445 Updated this week
- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models ☆3,768 Updated 3 weeks ago
- Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, im… ☆3,280 Updated last week
- Managing long-term, working, and external memory with OS-level scheduling, retrieval, and updates. ☆3,724 Updated this week
- Kimi CLI is your next CLI agent. ☆3,858 Updated this week
- GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning ☆2,119 Updated last month
- Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆17,808 Updated 2 weeks ago
- The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025. ☆8,651 Updated last month
- Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement… ☆8,139 Updated last week
- LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm. ☆11,912 Updated this week
- Trae Agent is an LLM-based agent for general purpose software engineering tasks. ☆10,545 Updated 3 months ago
- Open-Source Frontier Voice AI ☆20,325 Updated last month
- MiroThinker is an open source deep research agent optimized for research and prediction. It achieves a 60.2% Avg@8 score on the challengi… ☆5,258 Updated this week
- A lightweight LMM-based Document Parsing Model ☆6,437 Updated last month
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆3,298 Updated 6 months ago
- UltraRAG v2: A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines ☆2,434 Updated this week
- Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and pe… ☆3,878 Updated 7 months ago
- Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models ☆1,612 Updated last week
- Expose your FastAPI endpoints as Model Context Protocol (MCP) tools, with Auth! ☆11,367 Updated last month
- Kimi K2 is the large language model series developed by Moonshot AI team ☆9,818 Updated 2 months ago
- "DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)" ☆13,905 Updated last week