openredact / expose-text
This is a prototype of a Python module for simple modification of document files.
★18, updated 3 years ago
Alternatives and similar repositories for expose-text
Users who are interested in expose-text compare it to the libraries listed below.
- German lemmatization with IWNLP as extension for spaCy (★26, updated 2 years ago)
- Dehyphenation of broken text (mainly German), i.e., extracted from a PDF (★39, updated 3 years ago)
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions (★19, updated 2 years ago)
- This repository contains all manually labeled data from the GermEval-2018 shared task. (★29, updated 7 years ago)
- A spaCy custom component that extracts and normalizes temporal expressions (★56, updated 2 years ago)
- spaCy + UDPipe (★163, updated 3 years ago)
- Next-generation Punkt sentence boundary detection with zero dependencies (★24, updated 3 months ago)
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… (★80, updated last year)
- Python port for IWNLP.Lemmatizer (★18, updated 2 years ago)
- Repository for "Towards Robust Named Entity Recognition for Historic German" (★18, updated 4 years ago)
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata (★94, updated 2 years ago)
- BERT and ELECTRA models trained on Europeana Newspapers (★38, updated 3 years ago)
- CLI for loading Wikidata subsets (or all of it) into Elasticsearch (★70, updated 3 years ago)
- Named entity recognition for the legal domain (★42, updated 4 years ago)
- A part-of-speech tagger with support for domain adaptation and external resources. (★23, updated 3 years ago)
- Legal Reference Extraction (★36, updated 6 months ago)
- Anonymization of legal cases (Fr) based on Flair embeddings (★87, updated 4 years ago)
- The Open Multilingual Wordnet (★65, updated last year)
- Language Model and Text Classification for German Language using Deep Learning (★18, updated 7 years ago)
- Plan and train German transformer models. (★23, updated 4 years ago)
- A minimal, pure Python library to interface with CoNLL-U format files. (★152, updated this week)
- GC4LM: A Colossal (Biased) language model for German (★13, updated 4 years ago)
- Use spaCy for NLP and output to the FoLiA XML format. (★12, updated last year)
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata (★168, updated 3 years ago)
- Norwegian Named Entities annotations on top of NDT (Norwegian Dependency Treebank) (★70, updated last year)
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. (★19, updated 3 years ago)
- CONLL-U to Pandas DataFrame (★31, updated 8 years ago)
- A tokenizer and sentence splitter for German and English web and social media texts. (★147, updated 11 months ago)
- ★70, updated 2 years ago
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre… (★84, updated 4 years ago)