openredact / expose-text
This is a prototype of a Python module for simple modification of document files.
☆18 · Updated 3 years ago
Alternatives and similar repositories for expose-text
Users who are interested in expose-text are comparing it to the libraries listed below.
- This is a prototype of a semi-automatic data anonymization app for German documents. ☆23 · Updated 2 years ago
- Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 · Updated 3 years ago
- This repository contains all manually labeled data from the GermEval-2018 shared task. ☆29 · Updated 7 years ago
- German lemmatization with IWNLP as extension for spaCy ☆26 · Updated 2 years ago
- Python port for IWNLP.Lemmatizer ☆18 · Updated 2 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆56 · Updated 2 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆95 · Updated 2 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆38 · Updated 4 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆81 · Updated last year
- Hunspell extension for spaCy 2.0. ☆94 · Updated last year
- Next-generation Punkt sentence boundary detection with zero dependencies ☆24 · Updated 3 weeks ago
- Python tools for interacting with Wikidata ☆158 · Updated 2 years ago
- Information extraction from English and German texts based on predicate logic ☆139 · Updated 2 years ago
- Toolkit to compile a comparable/parallel corpus from European Parliament proceedings ☆16 · Updated 5 years ago
- ☆70 · Updated 3 years ago
- Dataframe Integration with spaCy. ☆103 · Updated 4 years ago
- CONLL-U to Pandas DataFrame ☆31 · Updated 8 years ago
- spaCy + UDPipe ☆163 · Updated 3 years ago
- Language Model and Text Classification for German Language using Deep Learning ☆18 · Updated 7 years ago
- CLI for loading Wikidata subsets (or all of it) into Elasticsearch ☆71 · Updated 3 years ago
- Norwegian Named Entities annotations on top of NDT (Norwegian Dependency Treebank) ☆71 · Updated last year
- Cutting-edge experimental spaCy components and features ☆104 · Updated last year
- A tokenizer and sentence splitter for German and English web and social media texts. ☆150 · Updated last year
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 · Updated 2 years ago
- A Dataset of German Legal Documents for Named Entity Recognition ☆172 · Updated 3 years ago
- A Super-Lightweight Annotation Tool for Experts: Label text in a terminal with just Python ☆112 · Updated 6 months ago
- Additional lookup tables and data resources for spaCy ☆113 · Updated 6 months ago
- EpiTator annotates epidemiological information in text documents. It is the natural language processing framework that powers GRITS and E… ☆42 · Updated 3 years ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆87 · Updated 5 years ago
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 · Updated last year