Dicklesworthstone/llm_aided_ocr

Enhances Tesseract OCR output using LLMs (local or API) for error correction, smart chunking, and markdown formatting of scanned PDFs

☆2,947

Alternatives and similar repositories for llm_aided_ocr

Users who are interested in llm_aided_ocr are comparing it to the libraries listed below.

datalab-to / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆21,176 · Updated this week
getomni-ai / zerox
OCR & Document Extraction using vision models
☆12,261 · May 20, 2025 · Updated last year
Ucas-HaoranWei / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,212 · Feb 10, 2025 · Updated last year
datalab-to / marker
Convert PDF to markdown + JSON quickly with high accuracy
☆37,994 · Jul 20, 2026 · Updated last week
allenai / olmocr
Toolkit for linearizing PDFs for LLM datasets/training
☆19,209 · Mar 25, 2026 · Updated 4 months ago
adithya-s-k / omniparse
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
☆7,650 · Dec 12, 2025 · Updated 7 months ago
Cinnamon / kotaemon
An open-source RAG-based tool for chatting with your documents.
☆25,666 · Jul 14, 2026 · Updated 2 weeks ago
Filimoa / open-parse
Improved file parsing for LLM’s
☆3,163 · May 17, 2026 · Updated 2 months ago
Dicklesworthstone / swiss_army_llama
A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario…
☆1,055 · Feb 27, 2025 · Updated last year
yigitkonur / api-llm-ocr
PDF to markdown using vision LLMs: tables, layouts, and structure preserved
☆899 · Feb 21, 2026 · Updated 5 months ago
opendatalab / PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
☆9,811 · Jan 3, 2025 · Updated last year
CosmosShadow / gptpdf
Using GPT to parse PDF
☆3,560 · Apr 17, 2025 · Updated last year
nilsherzig / LLocalSearch
LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a ch…
☆5,954 · Mar 24, 2026 · Updated 4 months ago
dottxt-ai / outlines
Structured Outputs
☆15,419 · Updated this week
Unstructured-IO / unstructured
Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean…
☆15,210 · Updated this week
InternLM / MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
☆6,906 · Jul 4, 2025 · Updated last year
lavague-ai / LaVague
Large Action Model framework to develop AI Web Agents
☆6,387 · Jan 21, 2025 · Updated last year
CatchTheTornado / text-extract-api
Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents…
☆3,152 · Dec 8, 2025 · Updated 7 months ago
dleemiller / WordLlama
Things you can do with the token embeddings of an LLM
☆1,450 · Dec 1, 2025 · Updated 7 months ago
katanaml / sparrow
Structured data extraction, instruction calling and agentic workflows with ML, LLM and Vision LLM
☆5,188 · Jun 30, 2026 · Updated 3 weeks ago
microsoft / graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
☆35,027 · Updated this week
neuml / txtai
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
☆12,765 · Updated this week
clovaai / donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
☆6,910 · Jul 11, 2024 · Updated 2 years ago
X-PLUG / mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
☆2,411 · May 30, 2025 · Updated last year
docling-project / docling
Get your documents ready for gen AI
☆63,950 · Updated this week
QuivrHQ / MegaParse
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
☆7,410 · Feb 21, 2025 · Updated last year
opendatalab / MinerU
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
☆76,160 · Updated this week
ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆34,311 · Updated this week
SciPhi-AI / R2R
SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.
☆7,946 · Nov 7, 2025 · Updated 8 months ago
D-Star-AI / dsRAG
High-performance retrieval engine for unstructured data
☆1,589 · Nov 10, 2025 · Updated 8 months ago
raphael-seo / Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
☆677 · May 13, 2026 · Updated 2 months ago
567-labs / instructor
structured outputs for llms
☆13,650 · Updated this week
DocumindHQ / documind
Open-source platform for extracting structured data from documents using AI.
☆1,517 · May 15, 2025 · Updated last year
Nutlope / llama-ocr
Document to Markdown OCR library with Llama 3.2 vision
☆2,429 · Jul 12, 2026 · Updated 2 weeks ago
Dicklesworthstone / fast_vector_similarity
High-performance vector similarity library in Rust with Python bindings: Spearman, Kendall, distance correlation, Jensen-Shannon, Hoeffdi…
☆430 · Feb 25, 2026 · Updated 5 months ago
run-llama / llama_index
LlamaIndex is the leading document agent and OCR platform
☆51,196 · Updated this week
deepdoctection / deepdoctection
A Repo For Document AI
☆3,199 · Jun 20, 2026 · Updated last month
collabora / WhisperFusion
WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
☆1,647 · Jul 31, 2024 · Updated last year
Dataherald / dataherald
Interact with your SQL database, Natural Language to SQL using LLMs
☆3,642 · Jul 24, 2024 · Updated 2 years ago