BobLd / PdfPigMLNetBlockClassifier
Proof of concept of training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
☆28 Updated 5 years ago
Alternatives and similar repositories for PdfPigMLNetBlockClassifier:
Users who are interested in PdfPigMLNetBlockClassifier are comparing it to the libraries listed below.
- A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig). ☆31 Updated 3 years ago
- Port of PragmaticSegmenter for sentence boundary detection ☆35 Updated 3 years ago
- Natural Language Processing Engine built with ML.NET ☆25 Updated 2 years ago
- Cross-platform library to render pdf documents as images with PdfPig using SkiaSharp