Liebeck / IWNLP-py
Python port for IWNLP.Lemmatizer
☆17 · Updated last year
Related projects
Alternatives and complementary repositories for IWNLP-py
- German lemmatization with IWNLP as extension for spaCy ☆24 · Updated last year
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago
- Search 'from' and 'to' strings to learn a text cleaning mapping ☆17 · Updated 9 years ago
- Language detection extension for spaCy 2.0+ ☆111 · Updated 5 years ago
- German Morphological Analyzer ☆47 · Updated 2 years ago
- German sentiment scores with SentiWS as extension for spaCy ☆36 · Updated last year
- Generic Environment for Context-Aware Correction of Orthography ☆22 · Updated 2 years ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea ☆13 · Updated 8 years ago
- A lemmatizer for German language text ☆87 · Updated last year
- Deutsch Language Tool Kit ☆12 · Updated 9 years ago
- An index data structure for approximate string search. ☆23 · Updated 5 years ago
- Extract, parse and populate templates from strings ☆27 · Updated 5 years ago
- Language Tool style grammar handling with spaCy 2.0 ☆42 · Updated 6 years ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆65 · Updated 2 years ago
- Annotated corpus of data from War of The Rebellion (American Civil War archives) ☆16 · Updated 8 years ago
- spaCy pipeline component for adding text readability meta data to Doc objects. ☆56 · Updated 5 years ago
- A web application for tagging and retrieval of arguments in text ☆30 · Updated last year
- A tokenizer and sentence splitter for German and English web and social media texts. ☆135 · Updated 3 months ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆60 · Updated this week
- Aho-Corasick string replacement utility ☆23 · Updated 4 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆37 · Updated 7 years ago
- Python binding for gumbo-parser using Cython ☆14 · Updated 8 years ago
- A compound word splitter for Python ☆48 · Updated 3 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆144 · Updated 10 months ago
- Finds linguistic patterns effortlessly ☆33 · Updated last year
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆38 · Updated 2 years ago
- Hunspell extension for spaCy 2.0. ☆94 · Updated 3 months ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆50 · Updated 4 years ago