konfuzio-ai / konfuzio-sdk
Run OCR, extract information from documents, and classify them. In addition, annotate documents and build custom NLP and computer vision models tailored to your specific use cases. Find examples with code in the Tutorials section of dev.konfuzio.com and get inspiration from the Use Cases section of our blog: https://konfuzio.com/en/category/marketpl…
☆61Updated this week
Alternatives and similar repositories for konfuzio-sdk:
Users who are interested in konfuzio-sdk are comparing it to the libraries listed below.
- Repository for deepdoctection tutorial notebooks☆40Updated last month
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip…☆63Updated this week
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO…☆46Updated 5 months ago
- DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning mod…☆19Updated 2 years ago
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.☆26Updated last year
- A Python library for extracting text from PDFs without losing the formatting of the PDF content.☆75Updated 3 years ago
- Logical structure analysis for visually structured documents☆85Updated 2 years ago
- ☆15Updated 3 years ago
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆12Updated 5 months ago
- `pdfstructure` detects, splits and organizes the document's text content into its natural structure as envisioned by the author.☆102Updated 9 months ago
- Examples using the Deep Search functionalities☆56Updated this week
- ☆21Updated 10 months ago
- A Streamlit app for showing a TimelineJS about the history of Natural Language Processing☆26Updated last year
- DocLLM: A layout-aware generative language model for multimodal document understanding☆119Updated last year
- LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment.☆34Updated 3 months ago
- 🖍️ Highlight text in documents☆99Updated 3 weeks ago
- PDF text data extraction web app with OCR for scanned documents☆83Updated 7 months ago
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents.☆25Updated last year
- ☆22Updated 7 months ago
- OCRmyPDF EasyOCR plugin☆56Updated 4 months ago
- A Python library to define and validate data types in Docling.☆56Updated this week
- A simple library for segmenting legal texts☆15Updated last year
- TableNet: Deep Learning model for end-to-end Table Detection and Tabular data extraction from Scanned Data Images In modern times, more a…☆52Updated 2 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy☆63Updated 10 months ago
- ☆28Updated last year
- Streamlit component for invoice document labeling☆56Updated 2 years ago
- Search PDFs using Jina, DocArray and Jina Hub☆55Updated 2 years ago
- H&M Fashion Image similarity search with Weaviate and DocArray☆42Updated 10 months ago
- NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, …☆77Updated last month
- ☆49Updated 6 months ago