proycon / analiticcl
an approximate string matching or fuzzy-matching system for spelling correction, normalisation or post-OCR correction
☆33 · Updated this week
Alternatives and similar repositories for analiticcl:
Users that are interested in analiticcl are comparing it to the libraries listed below
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆14 · Updated 6 months ago
- Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing. ☆73 · Updated last year
- ☆32 · Updated 2 years ago
- Pure Rust port of CRFsuite: a fast implementation of Conditional Random Fields (CRFs) ☆29 · Updated 3 months ago
- 📜 Dehyphenation of broken text (mainly German), e.g., extracted from a PDF ☆38 · Updated 2 years ago
- Further developed as SyntaxDot: https://github.com/tensordot/syntaxdot ☆13 · Updated 4 years ago
- Rust binding to crfsuite ☆25 · Updated 2 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆61 · Updated 9 months ago
- A Named-Entity Recogniser based on Grobid. ☆50 · Updated 5 months ago
- Named entity annotation tool ☆27 · Updated last year
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆17 · Updated 2 weeks ago
- Modular Rust transformer/LLM library using Candle ☆36 · Updated 9 months ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆25 · Updated 5 years ago
- PDF parser powered by grobid ☆25 · Updated 6 months ago
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆48 · Updated 2 years ago
- Process, enhance and evaluate multiple OCR outputs. ☆22 · Updated 3 months ago
- An efficient data structure for fast string similarity searches ☆22 · Updated 4 years ago
- Named entity recognition for the legal domain ☆41 · Updated 3 years ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆54 · Updated last year
- Layout Analysis Dataset with Segmonto (LADaS) ☆19 · Updated last week
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at… ☆22 · Updated 6 months ago
- Fast, permanent and flexible patterns for sharing and computing on texts with metadata using Apache Arrow. ☆14 · Updated 2 years ago
- Discourse Analysis Tool Suite ☆18 · Updated this week
- Keeping It Simple is Hard ☆10 · Updated last year
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 · Updated last year
- 🌸 Train floret vectors ☆18 · Updated last year
- A deep learning architecture for reference mining from literature in the arts and humanities. ☆15 · Updated 5 years ago
- Rust binding for the sentencepiece library ☆20 · Updated last year
- Data Mining Historical Newspaper Metadata (METS/ALTO formats) ☆25 · Updated 2 years ago