Proof of concept for training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block on a PDF page as title, text, list, table, or image.
☆28 · Mar 16, 2020 · Updated 6 years ago
Alternatives and similar repositories for PdfPigMLNetBlockClassifier
Users interested in PdfPigMLNetBlockClassifier are comparing it to the libraries listed below.
- A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig). ☆35 · Feb 4, 2022 · Updated 4 years ago
- Port of PragmaticSegmenter for sentence boundary detection ☆39 · Sep 21, 2021 · Updated 4 years ago
- Document Layout Analysis resources repos for development with PdfPig. ☆631 · Oct 1, 2023 · Updated 2 years ago
- Converts docx to html ☆14 · Updated this week
- A docx renderer that outputs Markdown-formatted text into Microsoft Word .docx documents ☆19 · Nov 25, 2023 · Updated 2 years ago
- PdfDocumentParser is a .NET toolset for building PDF parsers. ☆45 · Feb 5, 2026 · Updated last month
- Compress PDF documents with help of ITextSharp and FreeImage third party libs. Excellent point to start and customize for your particular… ☆22 · Oct 31, 2017 · Updated 8 years ago
- A C# wrapper for the WORLD vocoder ☆24 · Jun 21, 2021 · Updated 4 years ago
- .NET wrapper around Google's PDFium library ☆27 · Jan 10, 2024 · Updated 2 years ago
- ☆36 · Mar 2, 2026 · Updated 2 weeks ago
- Extract tables from PDF files (port of tabula-java) ☆205 · Mar 17, 2025 · Updated 11 months ago
- C# Library for converting PDF files to Searchable PDF Files ☆30 · Jun 7, 2024 · Updated last year
- Extract tables (and paragraphs outside tables) from pdf ☆34 · Nov 27, 2023 · Updated 2 years ago
- JPEG decoder, encoder and optimizer implemented in C#. ☆34 · Apr 15, 2024 · Updated last year
- .NET Standard wrapper for fastText library. Now works on Windows, Linux and macOS! ☆80 · Updated this week
- Clever Stocker ☆11 · Jan 12, 2022 · Updated 4 years ago
- Elasticsearch provider for Examine in Umbraco v8 ☆12 · Jan 15, 2024 · Updated 2 years ago
- Awesome Entity Alignment is a collection of EA techniques, including papers, codes, and datasets. ☆11 · Oct 27, 2022 · Updated 3 years ago
- A very lightweight editor to preview your changes in the XAML Path Markup ☆11 · Apr 28, 2014 · Updated 11 years ago
- Search Bar Spotlight-like for Windows 10 ☆11 · Oct 5, 2023 · Updated 2 years ago
- MJPEG Streaming (Screen-WinForms, Camera-UWP) ☆14 · Sep 28, 2017 · Updated 8 years ago
- A fully customizable and modern Flutter media picker inspired by Instagram. Supports image/video selection, multi-pick, album browsing, a… ☆11 · Feb 21, 2026 · Updated 3 weeks ago
- This is my speaker recognition implementation based on the x-vector system described in "X-Vectors: Robust DNN Embeddings for Speaker Rec… ☆10 · Nov 3, 2022 · Updated 3 years ago
- Code that drives the public web-based tools for the Media Cloud Online News Archive and Directory. ☆11 · Updated this week
- A customizable React feedback form with optional screenshot via screen capture and canvas editor based on material-ui. ☆12 · Jan 22, 2026 · Updated last month
- Second edition for the cron.weekly generator: gets all bookmarks from the Pocket API and structures the markdown. ☆10 · Apr 24, 2020 · Updated 5 years ago
- Article Analysis Assistant ☆19 · Feb 24, 2026 · Updated 2 weeks ago
- Tools for extracting tabular data from PDFs, using pdfminer ☆13 · Nov 13, 2023 · Updated 2 years ago
- A Paint.NET FileType plugin that loads and saves Paint Shop Pro images. ☆10 · Jul 24, 2024 · Updated last year
- Code and data for the CIKM2021 paper "Learning Ideological Embeddings From Information Cascades" ☆10 · Sep 8, 2021 · Updated 4 years ago
- Renders html to pdf or pngs ☆12 · Updated this week
- DocNetExtended is a small extension library built upon the DocNet library, designed to extract text in a readable order from PDFs ☆10 · Nov 12, 2021 · Updated 4 years ago
- WPF Transitions library with set of nice transitions ☆11 · Jan 31, 2021 · Updated 5 years ago
- MFM workshop project ☆14 · Jan 25, 2021 · Updated 5 years ago
- A Blazor component wrapper for Lottie Web. ☆15 · Aug 2, 2025 · Updated 7 months ago
- A media player. ☆11 · Feb 1, 2026 · Updated last month
- This eBay module is not supported anymore. Please use eBay module version 2: https://addons.prestashop.com/en/marketplaces/27282-ebay-20-… ☆11 · May 2, 2017 · Updated 8 years ago
- Bad link reporter for GitHub repositories ☆13 · Mar 25, 2024 · Updated last year
- An appointment scheduler application powered by Google Calendar API, Python (Flask), Javascript (React) ☆10 · Feb 13, 2026 · Updated last month