BobLd / PdfPigMLNetBlockClassifier
Proof of concept for training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block on a PDF document page as one of title, text, list, table, or image.
☆27 Updated 5 years ago
Alternatives and similar repositories for PdfPigMLNetBlockClassifier:
Users who are interested in PdfPigMLNetBlockClassifier are comparing it to the libraries listed below:
- A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig). ☆31 Updated 3 years ago
- Port of PragmaticSegmenter for sentence boundary detection ☆35 Updated 3 years ago
- Natural Language Processing Engine built with ML.NET ☆25 Updated 2 years ago
- PMS full-text search engine with no external dependencies written in C# ☆23 Updated last year
- Cross-platform library to render PDF documents as images with PdfPig using SkiaSharp ☆27 Updated 2 weeks ago
- PDF viewer and editor toolset. ☆44 Updated last week
- A tool for detecting identifiable information in data sources (CSV, DICOM, Relational Database and MongoDB) ☆14 Updated this week
- A .NET library to aid WebView2 control hosting, .NET/JavaScript interop and Html to Pdf Conversion ☆37 Updated last month
- Free PDF renderer for .NET ☆67 Updated last month
- ☆14 Updated last week
- Text clustering algorithm, implemented in .NET ☆22 Updated last year
- ☆17 Updated last year
- .NET implementation of Google's word2vec tools. ☆37 Updated 2 years ago
- Inject deep copy constructors into C# types ☆13 Updated 2 years ago
- A set of variations on ObjectPool implementations with differing underlying collections. ☆17 Updated 2 months ago
- Create bindings from .NET to JavaScript ☆16 Updated last week
- .NET wrapper of spaCy (Industrial-strength NLP) ☆18 Updated 5 years ago
- NLTK library wrapper for .NET ☆47 Updated 2 weeks ago
- Using a Mask R-CNN model trained on the PubLayNet dataset with ML.NET in C# / .NET for document layout analysis and page segmentation task… ☆16 Updated last year
- SpacyDotNet is a .NET wrapper for the popular natural language library spaCy ☆33 Updated 3 years ago
- Word2Vec.Net-CSharp ☆18 Updated 6 years ago
- .NET Core proxy library based on HttpClient that works with FreeProxyList.net ☆19 Updated 2 years ago
- .NET client for Qdrant vector database ☆17 Updated last year
- A .NET Standard port of the Argotic Syndication Framework for RSS / ATOM / RSD / OPML / APML / BlogML / Yahoo Media / iTunes ☆16 Updated 2 weeks ago
- Reed-Solomon Erasure Coding in C#/.NET ☆26 Updated last year
- C# SDK based on official HuggingFace OpenAPI specification ☆37 Updated last week
- Low-code automation test cases for web and desktop applications, and more! ☆19 Updated 3 weeks ago
- ☆34 Updated 11 months ago
- Research EXpression Language ☆23 Updated 8 months ago
- Extension methods for PDFSharp to simplify common operations, including image extraction. ☆35 Updated 2 years ago