openredact / expose-text
This is a prototype of a Python module for simple modification of document files.
☆18 · Updated 3 years ago
Alternatives and similar repositories for expose-text:
Users who are interested in expose-text are comparing it to the libraries listed below.
- This is a prototype of a semi-automatic data anonymization app for German documents. ☆20 · Updated 2 years ago
- Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆38 · Updated 3 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆38 · Updated 3 years ago
- This repository contains all manually labeled data from the GermEval-2018 shared task. ☆30 · Updated 6 years ago
- Repository for "Towards Robust Named Entity Recognition for Historic German" ☆18 · Updated 4 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 · Updated last year
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 · Updated 2 years ago
- Plan and train German transformer models. ☆23 · Updated 4 years ago
- GC4LM: A Colossal (Biased) language model for German ☆13 · Updated 3 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 · Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 · Updated 9 months ago
- A part-of-speech tagger with support for domain adaptation and external resources. ☆22 · Updated 2 years ago
- Language Model and Text Classification for German Language using Deep Learning ☆18 · Updated 6 years ago
- Legal Reference Extraction ☆29 · Updated 8 months ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ☆12 · Updated last year
- Named entity annotation tool ☆27 · Updated last year
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at… ☆22 · Updated 8 months ago
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 · Updated last year
- CLI for loading Wikidata subsets (or all of it) into Elasticsearch ☆70 · Updated 3 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆94 · Updated 2 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆98 · Updated last year
- Get annotation suggestions for the INCEpTION text annotation platform from spaCy, Sentence BERT, scikit-learn and more. Runs as a web-ser… ☆45 · Updated 6 months ago
- 🧮 Python package to construct word embeddings for small data using PMI and SVD ☆17 · Updated 4 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 · Updated 2 years ago
- Mining Legal Arguments in Court Decisions - Data and software ☆67 · Updated last year
- Compiled tools, datasets, and other resources for historical text normalization. ☆18 · Updated 5 years ago
- Compute PageRank on >3 billion Wikipedia links on off-the-shelf hardware. ☆58 · Updated 5 months ago
- Coreference resolution for German ☆16 · Updated 7 years ago
- Named entity recognition for the legal domain ☆42 · Updated 3 years ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ☆21 · Updated last year