BobLd / PdfPigMLNetBlockClassifier
Proof of concept of training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
☆28Updated 5 years ago
Alternatives and similar repositories for PdfPigMLNetBlockClassifier
Users who are interested in PdfPigMLNetBlockClassifier are comparing it to the libraries listed below.
- Port of PragmaticSegmenter for sentence boundary detection☆35Updated 3 years ago
- A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig).☆33Updated 3 years ago
- PdfDocumentParser is a .NET toolset for building PDF parsers.☆45Updated last year
- Fast and memory-efficient library for WordPiece tokenization as it is used by BERT.☆49Updated this week
- .NET assembly class responsible for converting OpenXml-based documents into corresponding dotnet code☆46Updated 3 weeks ago
- A tool for detecting identifiable information in data sources (CSV, DICOM, Relational Database and MongoDB)☆13Updated this week
- Open source project for BERT Tokenizers in C#.☆86Updated 2 years ago
- Natural Language Processing Engine built with ML.NET☆25Updated 2 years ago
- PDF viewer and editor toolset.☆48Updated 3 months ago
- C# and VB.NET samples for Docotic.Pdf library☆78Updated last week
- A docx renderer that outputs Markdown-formatted text to Microsoft Word .docx documents☆19Updated last year
- A lightweight C# library to render PDFs with Google's Pdfium in .NET Core and .NET Framework apps.☆72Updated 4 years ago
- C# Word2Vec object with fast neighbor search. Format compatible with gensim☆25Updated 5 years ago
- A .NET library to aid WebView2 control hosting, .NET/JavaScript interop and Html to Pdf Conversion☆39Updated last month
- .NET Standard wrapper for fastText library. Now works on Windows, Linux and macOS!☆76Updated 9 months ago
- This project implements token calculation for OpenAI's gpt-4 and gpt-3.5-turbo models, specifically using `cl100k_base` encoding.☆76Updated last week
- Free PDF renderer for .NET☆69Updated last week
- Word2Vec.Net-CSharp☆18Updated 6 years ago
- C# bindings for MuPDF☆81Updated 2 weeks ago
- ASP.NET Core Web, WebApi & WPF implementations for LLama.cpp & LLamaSharp☆58Updated last year
- .NET wrapper around Google's PDFium library☆26Updated last year
- Fast dynamic CSV records reader and writer extensions for CsvHelper☆21Updated last year
- Cross-platform library to render pdf documents as images with PdfPig using SkiaSharp☆32Updated this week
- A C# wrapper written around QPDF, allowing for various operations on PDF documents: transformations, page manipulation, linearization, an…☆20Updated last year
- A lightweight .NET library with things to maximize productivity.☆28Updated last month
- Create LaTeX documents in C# with object-oriented programming☆24Updated 8 years ago
- A C# port of Mozilla Universal Charset Detector.☆27Updated 5 years ago
- ☆29Updated 2 weeks ago
- OpenType typeface utilities.☆8Updated 6 months ago
- PMS full-text search engine with no external dependencies written in C#☆23Updated last year