NanoNets / ocr-python
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
☆121Updated 2 years ago
Alternatives and similar repositories for ocr-python
Users who are interested in ocr-python are comparing it to the libraries listed below.
- A clean Gradio theme with dark and light variants.☆39Updated last year
- Multimodal document parser for high quality data understanding and extraction☆84Updated last week
- ☆72Updated 11 months ago
- ☆23Updated last year
- Data extraction with LLM on CPU☆267Updated last year
- A package for visualising Chroma vector collections in 3D☆108Updated last year
- ☆124Updated 8 months ago
- Collection of PDF parsing libraries like AI-based docling, claude, openai, gemini, meta's llama-vision, unstructured-io, and pdfminer, py…☆151Updated 2 months ago
- Data extraction with Donut ML model☆57Updated last year
- PDF intelligence platform combining IBM Docling for document processing, LlamaIndex for data structuring, and Streamlit for a powerful UI…☆51Updated 10 months ago
- An MCP server connecting to managed indexes on LlamaCloud☆82Updated 4 months ago
- Using GPT-4 Vision and GPT-4 Turbo, take a PDF as input and get a markdown file as output.☆98Updated 9 months ago
- Chat with PDF files with source highlights☆146Updated 11 months ago
- Open Source Note GPT. Turn your photos and images into text notes (in obsidian)☆95Updated 8 months ago
- Simple package to extract text with coordinates from programmatic PDFs☆212Updated last week
- ☆47Updated last year
- ☆86Updated last year
- This project enhances the construction of RAG applications by addressing challenges, improving accessibility, scalability, and managing d…☆147Updated last year
- react + next.js dashboard for R2R: The most advanced AI retrieval system. Containerized, Retrieval-Augmented Generation (RAG) with a REST…☆167Updated 6 months ago
- Local-GenAI-Search is a generative search engine based on Llama 3, langchain and qdrant that answers questions based on your local files☆94Updated last year
- Extract structured text from pdfs quickly☆619Updated 4 months ago
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.☆29Updated 2 years ago
- Groq goes brrrrr... so had to make a basic Streamlit app you can build upon!☆83Updated 9 months ago
- Create-tsi is a generative AI RAG toolkit which generates AI Applications with low code.☆234Updated last year
- 🌉 How to deploy an open-source code LLM for your dev team☆108Updated last year
- A Python package for converting PDFs to markdown while extracting images and tables, generate descriptive text descriptions for extracted…☆161Updated 4 months ago
- Chainlit app for advanced RAG. Uses llamaparse, langchain, qdrant and models from groq.☆48Updated last year
- Visualize Different Text Splitting Methods☆300Updated 10 months ago
- Structured information extraction from documents☆318Updated last year
- A Demo of Cache-Augmented Generation (CAG) in an LLM☆109Updated 5 months ago