kuhumcst / cstlemma
Lemmatiser for Danish, Dutch, English, German, Polish, Romanian, Russian and dozens of other languages. It uses affix rules (prefix, infix, suffix, circumfix) that are obtained by supervised learning from a list of full form and lemma pairs.
☆36 · Updated last week
Alternatives and similar repositories for cstlemma:
Users who are interested in cstlemma compare it to the libraries listed below.
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆112 · Updated 2 months ago
- linguistics backend ☆41 · Updated 2 years ago
- Entity linker for the newspaper collection of the National Library of the Netherlands. Links named entity mentions to DBpedia description… ☆11 · Updated 2 years ago
- Convert CoNLL output of a dependency parser into a latex or graphviz tree ☆12 · Updated 5 years ago
- KenLM extension for spaCy 2.0. ☆16 · Updated 7 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 · Updated last year
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆75 · Updated 3 years ago
- Lexicons for the Multilingual UCREL Semantic Analysis System ☆41 · Updated last year
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆28 · Updated 4 years ago
- A plugin for the GATE language technology framework for training and using machine learning models. Currently supports Mallet (MaxEnt, N… ☆26 · Updated 2 years ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation ☆41 · Updated 9 years ago
- spaCy + UDPipe ☆161 · Updated 3 years ago
- Multi Tier Annotation Search ☆26 · Updated 3 years ago
- Python library providing sentiment lexicons. ☆26 · Updated 8 years ago
- ☆54 · Updated 9 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆38 · Updated 3 years ago
- Featurize words into orthographic and phonological vectors. ☆40 · Updated last year
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆63 · Updated 11 months ago
- Language detection extension for spaCy 2.0+ ☆112 · Updated 6 years ago
- Detect and align similar passages ☆100 · Updated 2 months ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. ☆16 · Updated 8 years ago
- The Potsdam Twitter Sentiment Corpus ☆17 · Updated 5 years ago
- Search back-end for dependency tree search. See the docs at https://fginter.github.io/dep_search/ ☆17 · Updated 7 years ago
- Hunspell extension for spaCy 2.0. ☆94 · Updated 8 months ago
- MiTextExplorer - interactive browser of text and document covariates. ☆24 · Updated 9 years ago
- 📂 Additional lookup tables and data resources for spaCy ☆106 · Updated 2 months ago
- Compare accuracies of udpipe models and spacy models which can be used for NLP annotation ☆14 · Updated 7 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 · Updated 2 years ago
- Language Tool style grammar handling with spaCy 2.0 ☆42 · Updated 6 years ago
- Jupyter extension to visualize dependency structures ☆28 · Updated 7 years ago