konfuzio-ai / konfuzio-sdk

Run OCR, extract information from documents, and classify them. In addition, annotate documents and build custom NLP and computer vision models tailored to your specific use cases. Find examples with code in the Tutorials section of dev.konfuzio.com and get inspiration from the Use Cases section of our blog: https://konfuzio.com/en/category/marketpl…

☆61

Alternatives and similar repositories for konfuzio-sdk:

Users who are interested in konfuzio-sdk are comparing it to the libraries listed below.

deepdoctection / notebooks
Repository for deepdoctection tutorial notebooks
☆40 Updated last month
marieai / marie-ai
Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip…
☆63 Updated this week
papercast-dev / papercast
A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO…
☆46 Updated 5 months ago
butlerlabs / docai
DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning mod…
☆19 Updated 2 years ago
Unstructured-IO / community
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
☆26 Updated last year
shahrukhx01 / multilingual-pdf2text
A Python library for extracting text from PDFs without losing the formatting of the PDF content.
☆75 Updated 3 years ago
stanfordnlp / pdf-struct
Logical structure analysis for visually structured documents
☆85 Updated 2 years ago
Layout-Parser / annotation-service
☆15 Updated 3 years ago
JustlyAI / lmss_entity_extractor
Tool to apply Legal Matter Specification Standard (LMSS) to documents
☆12 Updated 5 months ago
ChrizH / pdfstructure
`pdfstructure` detects, splits, and organizes the document's text content into its natural structure as envisioned by the author.
☆102 Updated 9 months ago
DS4SD / deepsearch-examples
Examples using the Deep Search functionalities
☆56 Updated this week
GeorgeLuImmortal / DocLLM_reimplementation
☆21 Updated 10 months ago
innerdoc / nlp-history-timeline
A Streamlit app for showing a TimelineJS about the history of Natural Language Processing
☆26 Updated last year
dswang2011 / DocLLM
DocLLM: A layout-aware generative language model for multimodal document understanding
☆119 Updated last year
s-emanuilov / litepali
LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment.
☆34 Updated 3 months ago
neuml / txtmarker
🖍️ Highlight text in documents
☆99 Updated 3 weeks ago
nainiayoub / pdf-text-data-extractor
PDF text data extraction web app with OCR for scanned documents
☆83 Updated 7 months ago
louisbrulenaudet / docutron
Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents.
☆25 Updated last year
nlmatics / nlm-tika
☆22 Updated 7 months ago
ocrmypdf / OCRmyPDF-EasyOCR
OCRmyPDF EasyOCR plugin
☆56 Updated 4 months ago
DS4SD / docling-core
A Python library to define and validate data types in Docling.
☆56 Updated this week
neelguha / legal-segmenter
A simple library for segmenting legal texts
☆15 Updated last year
AmanSavaria1402 / TableNet
TableNet: Deep Learning model for end-to-end Table Detection and Tabular data extraction from Scanned Data Images. In modern times, more a…
☆52 Updated 2 years ago
explosion / spacy-huggingface-pipelines
💥 Use Hugging Face text and token classification pipelines directly in spaCy
☆63 Updated 10 months ago
Unstructured-IO / unstructured-api-tools
☆28 Updated last year
katanaml / streamlit-sparrow-labeling-comp
Streamlit component for invoice document labeling
☆56 Updated 2 years ago
alexcg1 / example-pdf-search
Search PDFs using Jina, DocArray and Jina Hub
☆55 Updated 2 years ago
laura-ham / HM-Fashion-image-neural-search
H&M Fashion Image similarity search with Weaviate and DocArray
☆42 Updated 10 months ago
nlpcloud / nlpcloud-python
NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, …
☆77 Updated last month
huridocs / pdf_paragraphs_extraction
☆49 Updated 6 months ago