proycon / analiticcl
an approximate string matching or fuzzy-matching system for spelling correction, normalisation or post-OCR correction
☆36 Updated 4 months ago
Alternatives and similar repositories for analiticcl
Users who are interested in analiticcl are comparing it to the repositories listed below
- Pure Rust port of CRFsuite: a fast implementation of Conditional Random Fields (CRFs) ☆29 Updated 2 months ago
- Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing. ☆78 Updated last year
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆65 Updated last year
- An efficient implementation of Partitioned Label Trees & its variations for extreme multi-label classification ☆88 Updated last year
- Further developed as SyntaxDot: https://github.com/tensordot/syntaxdot ☆13 Updated 4 years ago
- Rust binding to crfsuite ☆25 Updated 3 years ago
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆49 Updated 2 years ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆113 Updated 5 months ago
- An efficient data structure for fast string similarity searches ☆22 Updated 4 years ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼 ☆22 Updated 5 months ago
- Rust binding for the sentencepiece library ☆22 Updated 2 months ago
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆17 Updated last month
- Generic Environment for Context-Aware Correction of Orthography ☆22 Updated 2 years ago
- notes on nushell ☆10 Updated 3 weeks ago
- Fast English word segmentation in Rust ☆99 Updated 3 weeks ago
- Modular Rust transformer/LLM library using Candle ☆36 Updated last year
- my take at a PDF text extraction utility ☆14 Updated 10 years ago
- This is the project planning repository for the CLARIAH-PLUS project. It groups all technical documents and discussions pertaining to CLA… ☆10 Updated 4 months ago
- A collection of open source tools and resources related to Wikibase knowledge graphs ☆72 Updated last year
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 Updated last year
- A tool for learning significant phrase/term models, and efficiently labeling with them. ☆33 Updated 2 months ago
- Rust port of https://github.com/UKPLab/sentence-transformers ☆29 Updated 5 years ago
- Simple NLP in Rust with Python bindings ☆152 Updated 2 years ago
- GSDMM: Short text clustering (Rust implementation) ☆22 Updated 2 years ago
- PDF command-line utils written in Rust ☆39 Updated 3 months ago
- Process, enhance and evaluate multiple OCR output. ☆22 Updated 8 months ago
- PAGE XML format collection for document image page content and more ☆67 Updated 4 years ago
- Fast, permanent and flexible patterns for sharing and computing on texts with metadata using Apache Arrow. ☆14 Updated 3 years ago
- Corpus Build OCR platform ☆8 Updated 2 years ago
- Named entity recognition for the legal domain ☆42 Updated 4 years ago