PeterisP / LVTagger
☆17Updated last month
Related projects
Alternatives and complementary repositories for LVTagger
- Full Stack of Latvian Language Resources for Natural Language Understanding (NLU) and Generation (NLG)☆14Updated 2 years ago
- Latvian morphology module☆33Updated this week
- e-magyar text processing system -- inter-module communication via tsv + REST API☆27Updated 11 months ago
- A part-of-speech tagger with support for domain adaptation and external resources.☆22Updated 2 years ago
- A set of workflows for corpus building through OCR, post-correction and normalisation☆48Updated 2 years ago
- Wrapper for DKPro Core to extract linguistic information from books.☆16Updated 2 years ago
- German Morphological Analyzer☆47Updated 3 years ago
- Named Entity Recognition (LSTM + CRF + FastText) with models for [historic] German☆26Updated 3 years ago
- PurePos is an open source hybrid morphological tagger.☆15Updated 4 years ago
- A tool for automatic spelling normalization☆20Updated 3 years ago
- Multi Tier Annotation Search☆26Updated 3 years ago
- Detect and align similar passages☆88Updated 2 months ago
- A Named-Entity Recogniser based on Grobid.☆49Updated 2 months ago
- A cloud-based, open-source system for writing and publishing dictionaries.☆86Updated 10 months ago
- Morphological analyzer and lemmatizer for Latin.☆25Updated last week
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.☆22Updated last year
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr…☆65Updated this week
- Conversions between various OCR formats☆71Updated last year
- The home repository of the NerKor corpus, a Hungarian gold standard named entity annotated corpus containing 1 million tokens.☆14Updated last year
- An advanced, extensible web front-end for the Manatee-open corpus search engine☆61Updated this week
- Advanced graph rewriting and LLOD publication for CoNLL and other TSV formats☆25Updated 6 months ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate…☆50Updated 4 years ago
- BERT and ELECTRA models trained on Europeana Newspapers☆36Updated 2 years ago
- Language Tool style grammar handling with spaCy 2.0☆42Updated 6 years ago
- A textual corpus database for the digital humanities.☆59Updated 4 years ago
- The Unicode Cookbook for Linguists☆53Updated 4 years ago
- Graph-based tool for disambiguation and linking of named entities to Linked Data sets for Digital Humanities and heritage texts☆27Updated 3 years ago
- A software to detect text reuse with BLAST.☆14Updated 5 years ago
- High-performance text aligner for large collections of texts☆45Updated 3 weeks ago
- PhiloLogic4☆37Updated 4 months ago