NanoNets / ocr-python
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
☆114 Updated 2 years ago
Alternatives and similar repositories for ocr-python
Users who are interested in ocr-python are comparing it to the libraries listed below.
- Data extraction with LLM on CPU ☆269 Updated last year
- ☆69 Updated 9 months ago
- Collection of PDF parsing libraries like AI based docling, claude, openai, gemini, meta's llama-vision, unstructured-io, and pdfminer, py… ☆137 Updated 3 weeks ago
- Chat with PDF files with source highlights ☆146 Updated 9 months ago
- ☆23 Updated last year
- Using GPT-4 Vision and GPT-4 Turbo, take a PDF as input and get a markdown file as output. ☆96 Updated 8 months ago
- Data extraction with Donut ML model ☆56 Updated last year
- Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text,… ☆94 Updated 10 months ago
- Web service for web page to Markdown conversion ☆259 Updated 6 months ago
- Multimodal document parser for high quality data understanding and extraction ☆80 Updated this week
- ☆112 Updated last year
- Chainlit app for advanced RAG. Uses llamaparse, langchain, qdrant and models from groq. ☆47 Updated last year
- ☆21 Updated 10 months ago
- TalkNexus: Ollama Chatbot Multi-Model & RAG Interface ☆62 Updated 6 months ago
- SemanticPDF: Drag, Drop, Semantic Search - SemanticPDF is a simple, privacy-focused application that makes it easy to upload a PDF file a… ☆70 Updated last year
- Corrective RAG demo powered by Ollama ☆105 Updated last year
- Intuitive RAG system on top of LlamaIndex ☆15 Updated 10 months ago
- Streamly - Streamlit Assistant is designed to provide the latest updates from Streamlit, generate code snippets for Streamlit widgets, an… ☆102 Updated last year
- Turn Webpage to LLM friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, image & webpage links extr… ☆217 Updated last month
- Built using Open Source Stack (Llama 3.2 Model, BGE Embeddings, and Qdrant running locally within a Docker Container) ☆148 Updated 2 months ago
- Run CrewAI agent workflows on local LLM models with Llamafile and Ollama ☆39 Updated last year
- Groq goes brrrrr... so had to make a basic Streamlit app you can build upon! ☆83 Updated 8 months ago
- PDF intelligence platform combining IBM Docling for document processing, LlamaIndex for data structuring, and Streamlit for a powerful UI… ☆48 Updated 8 months ago
- ☆25 Updated last year
- Haystack and Mistral 7B RAG Implementation. It is based on a completely open-source stack. ☆79 Updated last year
- A RAG system designed to process documents with multimodal content. It can generate factual, context-aware answers to user queries, based… ☆26 Updated 9 months ago
- Open-source RAG evaluation through users' feedback ☆204 Updated last year
- ☆86 Updated last year
- ☆124 Updated 6 months ago
- like firecrawl.dev but free ☆49 Updated 6 months ago