BobLd / PdfPigMLNetBlockClassifier

Proof of concept for training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block on a PDF document page as one of title, text, list, table, or image.
23 · Updated 4 years ago

Related projects

Alternatives and complementary repositories for PdfPigMLNetBlockClassifier