Unstructured-IO / pipeline-paddleocr
Pipeline for converting PDFs to raw text with PaddleOCR
☆23 · Updated 2 years ago
Alternatives and similar repositories for pipeline-paddleocr
Users who are interested in pipeline-paddleocr are comparing it to the libraries listed below.
- Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provi… ☆38 · Updated 6 months ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆73 · Updated this week
- ☆19 · Updated 7 months ago
- Open-source observability for your LLM application. ☆53 · Updated 8 months ago
- ☆192 · Updated last week
- Data extraction with Donut ML model ☆56 · Updated last year
- GLiNER model in a FastAPI microservice. ☆45 · Updated 9 months ago
- Retrieval of fully structured data made easy. Use LLMs or custom models. Specialized on PDFs and HTML files. Extensive support of tabular… ☆73 · Updated 3 weeks ago
- Unattended Lightweight Text Classifiers with LLM Embeddings ☆183 · Updated last year
- A tool to OCR PDFs using gen-AI models ☆43 · Updated 3 months ago
- ☆122 · Updated 6 months ago
- Scraping and querying documents for LLMs ☆23 · Updated last month
- Embedding models from Jina AI ☆64 · Updated last year
- Simplifies the process of creating and managing LLM workflows. ☆109 · Updated 10 months ago
- Convert a web page to Markdown ☆78 · Updated last year
- A series of tutorials implementing a RAG service with BentoML and LlamaIndex ☆47 · Updated 8 months ago
- hotpdf is a fast PDF parsing library to extract text and find text within PDF documents built on top of pdfminer.six ☆196 · Updated 9 months ago
- Private ChatGPT/Perplexity. Securely unlocks knowledge from confidential business information. ☆72 · Updated 11 months ago
- ☆40 · Updated 2 years ago
- Web Interface for Vision Language Models Including InternVLM2 ☆23 · Updated last year
- Local Ollama with Qdrant RAG: Embed, index, and enhance models for retrieval-augmented generation. Get started with easy setup for powerf… ☆21 · Updated last year
- This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Larg… ☆24 · Updated 6 months ago
- Self-hosted llmapi server that makes it really easy to access LLMs! ☆37 · Updated 2 years ago
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines. ☆29 · Updated 2 years ago
- A Prodigy plugin for PDF annotation ☆35 · Updated last month
- LLM prompt language based on Jinja. Banks provides tools and functions to build prompts text and chat messages from generic blueprints. I… ☆115 · Updated 2 months ago
- Split and analyze text files using LangChain and Streamlit ☆48 · Updated last year
- ☆23 · Updated 7 months ago
- 90% of what you need for LLM app development. Nothing you don't. ☆265 · Updated 3 weeks ago
- Excel spreadsheet crawler and table parser for data extraction and querying ☆156 · Updated 6 months ago