allenai / olmocr
Toolkit for linearizing PDFs for LLM datasets/training
☆13,781Updated this week
Alternatives and similar repositories for olmocr
Users who are interested in olmocr are comparing it to the libraries listed below.
- A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.☆42,085Updated this week
- OCR & Document Extraction using vision models☆11,772Updated 3 months ago
- A simple screen parsing tool towards pure vision based GUI agent☆23,326Updated this week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆7,785Updated 6 months ago
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆8,375Updated 7 months ago
- Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.☆6,809Updated last month
- OCR, layout analysis, reading order, table recognition in 90+ languages☆18,337Updated this week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.☆10,976Updated 3 weeks ago
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation☆17,853Updated 3 weeks ago
- KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning a…☆7,661Updated last week
- Python tool for converting files and office documents to Markdown.☆71,877Updated last week
- No fortress, purely open ground. OpenManus is Coming.☆49,003Updated last week
- Convert PDF to markdown + JSON quickly with high accuracy☆27,942Updated this week
- 🖥️ Run AI Agent in your browser.☆14,598Updated last week
- The python library for real-time communication☆4,210Updated this week
- 🪄 Create rich visualizations with AI☆13,293Updated this week
- A visual playground for agentic workflows: Iterate over your agents 10x faster☆5,358Updated last month
- Vision agent☆5,007Updated 2 weeks ago
- Multilingual Document Layout Parsing in a Single Vision-Language Model☆3,377Updated this week
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks☆6,666Updated 2 months ago
- Use your locally running AI models to assist you in your web browsing☆6,998Updated last week
- The Open-sourced Multimodal AI Agent Stack connecting Cutting-edge AI Models and Agent Infra.☆16,504Updated this week
- RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.☆62,444Updated this week
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.☆24,091Updated last week
- StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language mo…☆3,991Updated 4 months ago
- The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.☆4,044Updated this week
- A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's CoT reasoning traces with Anthropic Claude models.☆5,298Updated 3 months ago
- Task-Aware Agent-driven Prompt Optimization Framework☆3,486Updated 2 weeks ago
- Build Real-Time Knowledge Graphs for AI Agents☆16,924Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.☆7,094Updated 6 months ago