com3dian / Grobidmonkey
Grobidmonkey is an open-source package for postprocessing GROBID outputs.
☆12 Updated last year
Alternatives and similar repositories for Grobidmonkey
Users who are interested in Grobidmonkey are comparing it to the libraries listed below.
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆70 Updated 3 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆20 Updated 2 years ago
- Leveraging LLMs for Post-OCR Correction of Historical Newspapers ☆15 Updated last year
- REMERGE - Multi-Word Expression discovery algorithm ☆14 Updated 3 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆74 Updated 10 months ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆35 Updated 10 months ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆56 Updated 2 years ago
- Python Finite-State Toolkit ☆60 Updated last month
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆45 Updated 2 years ago
- A small python library to parse and write TSV files generated by the WebAnno software. ☆12 Updated 9 months ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆170 Updated 3 years ago
- A module to compute textual lexical richness (aka lexical diversity). ☆112 Updated 2 years ago
- This is a simple Python package for calculating a variety of lexical diversity indices ☆82 Updated 2 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆112 Updated 3 weeks ago
- A simple toolkit for conducting analyses using corpus methods ☆27 Updated 4 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆81 Updated last year
- BERT and ELECTRA models trained on Europeana Newspapers ☆38 Updated 4 years ago
- ☆81 Updated last week
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated last year
- Gamma Agreement in Python ☆45 Updated last year
- Searching in-memory corpus with Corpus Query Language (CQL) ☆19 Updated last year
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆98 Updated 2 years ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases". ☆99 Updated 3 years ago
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ☆59 Updated last year
- ☆34 Updated 2 years ago
- ☆50 Updated last year
- Tool to fix bitexts and tag near-duplicates for removal ☆34 Updated 5 months ago
- Fast computation of Krippendorff's alpha agreement measure in Python. ☆154 Updated 2 months ago
- OpusFilter - Parallel corpus processing toolkit ☆115 Updated this week
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ☆31 Updated 5 years ago