Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
★682 · May 20, 2025 · Updated 10 months ago
Alternatives and similar repositories for Versatile-OCR-Program
Users that are interested in Versatile-OCR-Program are comparing it to the libraries listed below.
- discover story relationships ★347 · Jun 24, 2025 · Updated 9 months ago
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid… ★2,759 · Updated this week
- The most accurate document search and store for building AI apps ★3,557 · Apr 2, 2026 · Updated last week
- Enhances Tesseract OCR output using LLMs (local or API) for error correction, smart chunking, and markdown formatting of scanned PDFs ★2,905 · Mar 22, 2026 · Updated 2 weeks ago
- Fully neural approach for text chunking ★406 · Oct 23, 2025 · Updated 5 months ago
- Toolkit for linearizing PDFs for LLM datasets/training ★17,100 · Mar 25, 2026 · Updated 2 weeks ago
- Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit … ★362 · May 21, 2025 · Updated 10 months ago
- Fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models. ★677 · Jul 7, 2025 · Updated 9 months ago
- Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯 ★887 · Dec 10, 2025 · Updated 3 months ago
- The SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web ★2,341 · Jun 9, 2025 · Updated 10 months ago
- Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages. ★1,732 · Dec 21, 2024 · Updated last year
- PDF to markdown using vision LLMs: tables, layouts, and structure preserved ★890 · Feb 21, 2026 · Updated last month
- OCR & Document Extraction using vision models ★12,193 · May 20, 2025 · Updated 10 months ago
- Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more. ★283 · Mar 2, 2026 · Updated last month
- Detect and extract tables to markdown and csv ★757 · Jan 24, 2025 · Updated last year
- ★895 · May 13, 2025 · Updated 10 months ago
- A cache for AI agents to learn and replay complex behaviors. ★758 · Jun 15, 2025 · Updated 9 months ago
- Great claude skills of everyone. ★36 · Nov 11, 2025 · Updated 4 months ago
- Vision infrastructure to turn complex documents into RAG/LLM-ready data ★2,939 · Sep 24, 2025 · Updated 6 months ago
- Secretary is an AI-powered tool that analyzes social media content from specified accounts and delivers results via WeChat. It supports c… ★359 · Aug 4, 2025 · Updated 8 months ago
- Speech to Text but with all the bells and whistles and most importantly AI! AI will clean up your filler words, edit and will refine what… ★332 · Feb 9, 2025 · Updated last year
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ★227 · Dec 24, 2024 · Updated last year
- Transcribe PDFs with local LLMs ★819 · Jan 27, 2026 · Updated 2 months ago
- ★10 · Feb 14, 2025 · Updated last year
- Have a natural, spoken conversation with AI! ★3,608 · Jul 11, 2025 · Updated 8 months ago
- Animating R1's thoughts. ★382 · Feb 17, 2025 · Updated last year
- NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extracti… ★2,900 · Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ★19,557 · Updated this week
- A powerful document AI question-answering tool that connects to your local Ollama models. Create, manage, and interact with RAG systems f… ★1,094 · Aug 9, 2025 · Updated 8 months ago
- ★272 · Nov 15, 2024 · Updated last year
- AI reads books: Page-by-Page PDF Knowledge Extractor & Summarizer. script performs an intelligent page-by-page analysis of PDF books, met… ★1,633 · Jan 20, 2025 · Updated last year
- ★85 · Apr 1, 2025 · Updated last year
- Convert PDF to markdown + JSON quickly with high accuracy ★33,352 · Updated this week
- A self-hosted API that takes a URL and returns a file with browser screenshots. ★1,177 · Mar 9, 2025 · Updated last year
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat… ★449 · Nov 24, 2025 · Updated 4 months ago
- A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office… ★7,262 · Updated this week
- A Python library to inspect and modify the internal structure of a PDF file ★1,011 · Aug 17, 2025 · Updated 7 months ago
- Pragmatic framework to build LLM Copilots ★64 · Mar 11, 2025 · Updated last year
- Web scraper made for AI and simplicity in mind. It runs as a CLI that can be parallelized and outputs high-quality markdown content. ★540 · Nov 3, 2025 · Updated 5 months ago