NanoNets / ocr-python
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
☆110Updated 2 years ago
Alternatives and similar repositories for ocr-python
Users who are interested in ocr-python are comparing it to the libraries listed below.
- ☆66Updated 8 months ago
- Collection of PDF parsing libraries like AI based docling, claude, openai, gemini, meta's llama-vision, unstructured-io, and pdfminer, py…☆111Updated last week
- Open Source Note GPT. Turn your photos and images into text notes (in obsidian)☆94Updated 5 months ago
- ☆122Updated 5 months ago
- Data extraction with LLM on CPU☆268Updated last year
- Making docling agentic through MCP☆156Updated this week
- A universal Qdrant table frontend based on transformers.js☆20Updated last year
- Demo of the neural semantic search built with Qdrant☆168Updated 3 months ago
- Python-tesseract is an optical character recognition (OCR) tool for python☆156Updated 7 years ago
- A Python client for the Unstructured Platform API☆106Updated this week
- Chat with PDF files with source highlights☆144Updated 8 months ago
- PDF intelligence platform combining IBM Docling for document processing, LlamaIndex for data structuring, and Streamlit for a powerful UI…☆47Updated 7 months ago
- A clean Gradio theme with dark and light variants.☆36Updated last year
- Streamly - Streamlit Assistant is designed to provide the latest updates from Streamlit, generate code snippets for Streamlit widgets, an…☆100Updated last year
- Awesome LLM application repo☆85Updated 5 months ago
- Simple package to extract text with coordinates from programmatic PDFs☆160Updated last week
- ☆113Updated 8 months ago
- Examples of integrating the OpenRouter API☆209Updated 3 months ago
- ☆96Updated this week
- Chainlit app for advanced RAG. Uses llamaparse, langchain, qdrant and models from groq.☆47Updated last year
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.☆28Updated 2 years ago
- Graphy v1: A Realtime GraphRAG App using Langchain, Neo4j, GPT-4o, and Streamlit.☆68Updated 10 months ago
- Multimodal document parser for high quality data understanding and extraction☆74Updated this week
- Chat with your Documents(PDF, TXT, DOCX, ODT, PPTX etc), Websites and Youtube Chat too!, CSV files. Uses langchain, Ollama, Groq, Gemini,…☆55Updated last year
- Structured information extraction from documents☆317Updated 10 months ago
- TalkNexus: Ollama Chatbot Multi-Model & RAG Interface☆62Updated 5 months ago
- Haystack and Mistral 7B RAG Implementation. It is based on completely open-source stack.☆79Updated last year
- Corrective RAG demo powered by Ollama☆104Updated last year
- ☆110Updated last year
- PDF Table Extractor is an innovative Python project designed to tackle the challenge of extracting tables from scanned PDF documents. Lev…☆38Updated last year