repodiac / german_compound_splitter
Compound splitter for the German language ("Komposita-Zerlegung"), based on a large dictionary combined with highly efficient multi-pattern string search
☆22 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for german_compound_splitter
- Open German WordNet ☆88 · Updated 9 months ago
- Compound splitter for German ☆103 · Updated 4 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆24 · Updated last year
- UIMA CAS processing library written in Python ☆85 · Updated 6 months ago
- Catalan BERT model ☆12 · Updated 4 years ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆36 · Updated last year
- Compiled tools, datasets, and other resources for historical text normalization. ☆16 · Updated 5 years ago
- ☆43 · Updated 3 months ago
- Small-vocabulary sequence-to-sequence generation with optional feature conditioning ☆31 · Updated this week
- coFR: COreference resolution tool for FRench (and singletons). ☆24 · Updated 4 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆135 · Updated 3 months ago
- Named Entity Recognition (LSTM + CRF + FastText) with models for [historic] German ☆26 · Updated 3 years ago
- ☆18 · Updated this week
- Linguistic and stylistic complexity measures for (literary) texts ☆77 · Updated 9 months ago
- A part-of-speech tagger with support for domain adaptation and external resources. ☆22 · Updated 2 years ago
- A neural dependency parser that does its best ☆15 · Updated this week
- Python Finite-State Toolkit ☆45 · Updated last week
- BERT and ELECTRA models trained on Europeana Newspapers ☆36 · Updated 2 years ago
- 🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪 ☆75 · Updated 3 years ago
- An initiative to collect and distribute resources for co-reference resolution in a unified standard. ☆24 · Updated 6 months ago
- Deutsches Lyrik Korpus (DLK) / German Poetry Corpus ☆17 · Updated 6 months ago
- Dutch coreference resolution & dialogue analysis using deterministic rules ☆21 · Updated last year
- SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages ☆7 · Updated 9 months ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆27 · Updated 5 months ago
- German Morphological Analyzer ☆47 · Updated 3 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆144 · Updated this week
- This packages up data for the Open Multilingual Wordnet ☆43 · Updated 3 weeks ago
- Morphological Inflection for Low-Resource Languages using cross-lingual transfer ☆20 · Updated 4 years ago
- A tool for automatic spelling normalization ☆20 · Updated 3 years ago
- A merged version of multiple open-source German speech datasets. ☆30 · Updated 6 months ago