Dicklesworthstone / llm_aided_ocr
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
☆2,477 Updated 6 months ago
Alternatives and similar repositories for llm_aided_ocr:
Users who are interested in llm_aided_ocr compare it to the libraries listed below.
- Detect and extract tables to markdown and csv ☆726 Updated 3 weeks ago
- 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library ☆2,599 Updated this week
- Vision model based document ingestion ☆1,658 Updated this week
- Improved file parsing for LLM’s ☆2,814 Updated 3 months ago
- An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing an… ☆828 Updated 4 months ago
- High-performance retrieval engine for unstructured data ☆1,169 Updated this week
- Knowledge Agents and Management in the Cloud ☆3,707 Updated this week
- Empowering RAG with a memory-based data interface for all-purpose applications! ☆1,633 Updated 2 months ago
- Things you can do with the token embeddings of an LLM ☆1,424 Updated 2 weeks ago
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents… ☆2,376 Updated 2 weeks ago
- RAG that intelligently adapts to your use case, data, and queries ☆2,936 Updated last week
- [ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs ☆1,601 Updated 3 months ago
- Document to Markdown OCR library with Llama 3.2 vision ☆2,173 Updated last month
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid… ☆2,274 Updated this week
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding ☆2,104 Updated last month
- The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. ☆4,945 Updated this week
- Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages. ☆953 Updated 2 months ago
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent… ☆2,533 Updated this week
- Local realtime voice AI ☆2,230 Updated this week
- A system for agentic LLM-powered data processing and ETL ☆1,677 Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆5,533 Updated this week
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆2,809 Updated 3 months ago
- Task-Aware Agent-driven Prompt Optimization Framework ☆2,823 Updated last month
- This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats. ☆1,185 Updated 4 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆16,314 Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆3,607 Updated last week
- A simple, easy-to-hack GraphRAG implementation ☆2,411 Updated last month
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. ☆1,492 Updated this week
- The Open Source Memory Layer For Autonomous Agents ☆2,000 Updated 4 months ago