BobLd / PdfPigMLNetBlockClassifier
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
☆28 Updated 5 years ago
Alternatives and similar repositories for PdfPigMLNetBlockClassifier:
Users who are interested in PdfPigMLNetBlockClassifier are comparing it to the libraries listed below.
- A C# library to extract tabular data from PDFs (port of the Python version of camelot, using PdfPig). ☆31 Updated 3 years ago
- Port of PragmaticSegmenter for sentence boundary detection ☆35 Updated 3 years ago
- Creates bindings from .NET to JavaScript ☆16 Updated last week
- Natural Language Processing Engine built with ML.NET ☆25 Updated 2 years ago
- A tool for detecting identifiable information in data sources (CSV, DICOM, Relational Database and MongoDB) ☆13 Updated this week
- PdfDocumentParser is a .NET toolset for building PDF parsers. ☆45 Updated 10 months ago
- A set of variations on ObjectPool implementations with differing underlying collections. ☆17 Updated 3 months ago
- ☆17 Updated last year
- An open source machine learning framework in .NET Core ☆15 Updated 6 years ago
- Cross-platform library to render PDF documents as images with PdfPig using SkiaSharp ☆28 Updated this week
- .NET O/R mapper that works with dynamically implemented abstract classes ☆16 Updated last year
- PMS full-text search engine with no external dependencies, written in C# ☆23 Updated last year
- Search engine library ☆29 Updated 9 years ago
- .NET Core proxy library based on HttpClient that works with FreeProxyList.net ☆19 Updated 2 years ago
- A docx renderer that outputs Markdown-formatted text to Microsoft Word .docx documents ☆19 Updated last year
- Simple application for full-text searching in the file system ☆17 Updated 4 years ago
- .NET wrapper of spaCy (Industrial-strength NLP) ☆18 Updated 5 years ago
- .NET 9 WPF CMS, Static Site Generation, Geo Tools, Cloud Backup, Feed Reader and PowerShell Runner ☆10 Updated 3 months ago
- Free PDF renderer for .NET ☆67 Updated last week
- PDF viewer and editor toolset. ☆47 Updated last month
- Fast and memory-efficient library for WordPiece tokenization as it is used by BERT. ☆48 Updated last month
- A .NET library to aid WebView2 control hosting, .NET/JavaScript interop and HTML to PDF conversion ☆37 Updated 2 months ago
- A C# wrapper written around QPDF, allowing for various operations on PDF documents: transformations, page manipulation, linearization, an… ☆18 Updated last year
- ☆14 Updated 2 weeks ago
- A simple NER implementation using a DistilBERT-based model with ML.NET ☆13 Updated 3 years ago
- .NET assembly class responsible for converting OpenXml-based documents into corresponding dotnet code ☆42 Updated last year
- ☆27 Updated 2 weeks ago
- Reed-Solomon Erasure Coding in C#/.NET ☆26 Updated last year
- Biser is a cross-platform Binary and JSON Serializer for .NET / dotnet / core / standard / CoreRT / Mono WASM / C# ☆42 Updated 6 months ago
- C# Dictionary Backed By FASTER ☆18 Updated 2 years ago