BobLd / PdfPigMLNetBlockClassifier
Proof of concept for training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block on a PDF page as one of: title, text, list, table, or image.
☆23 Updated 4 years ago
Related projects:
- A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig). ☆31 Updated 2 years ago
- Port of PragmaticSegmenter for sentence boundary detection ☆32 Updated 2 years ago
- .NET Core proxy library based on HttpClient that works with FreeProxyList.net ☆19 Updated last year
- Natural Language Processing Engine built with ML.NET ☆24 Updated last year
- C# Word2Vec object with fast neighbor search. Format compatible with gensim ☆25 Updated 4 years ago
- Cross-platform library to render PDF documents as images with PdfPig using SkiaSharp ☆13 Updated last week
- Word2Vec.Net-CSharp ☆18 Updated 5 years ago
- ONNX format parsing and manipulation in C#. ☆24 Updated last year
- C# library for fast embeddings projection using Uniform Manifold Approximation and Projection ☆37 Updated 9 months ago
- C# library for converting PDF files to searchable PDF files ☆27 Updated 3 months ago
- .NET implementation of Google's word2vec tools. ☆37 Updated last year
- .NET Standard wrapper for fastText library. Now works on Windows, Linux and macOS! ☆67 Updated last year
- .NET wrapper of spaCy (Industrial-strength NLP) ☆17 Updated 5 years ago
- Free PDF renderer for .NET ☆57 Updated last week
- Open source project for BERT Tokenizers in C#. ☆76 Updated last year
- NER (Named Entity Recognition) implementation using a BERT/DistilBERT-based ONNX model for Token Classification in ML.NET ☆17 Updated 2 weeks ago
- BERT Model for dotnet ML ☆93 Updated last year
- Create bindings from .NET to JavaScript ☆15 Updated this week
- .NET client for Qdrant vector database ☆17 Updated 9 months ago
- SpacyDotNet is a .NET wrapper for the popular natural language library spaCy ☆30 Updated 2 years ago
- C# Dictionary Backed By FASTER ☆18 Updated last year
- PDF viewer and editor toolset. ☆31 Updated 2 weeks ago
- Cross-platform PDF reader application ☆23 Updated this week
- Fast and memory-efficient library for WordPiece tokenization as it is used by BERT. ☆38 Updated 2 months ago
- Search photos based on their content using an ONNX model and ML.NET ☆13 Updated 5 years ago
- Just xps2pdf ☆16 Updated last year
- Implement llama 2/3 using torchsharp ☆12 Updated 3 weeks ago
- Reed-Solomon Erasure Coding in C#/.NET ☆25 Updated last year
- A .NET Standard port of the Argotic Syndication Framework for RSS / ATOM / RSD / OPML / APML / BlogML / Yahoo Media / iTunes ☆14 Updated last week