devmount / GermanWordEmbeddings
Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated testsets. Built with Gensim and TensorFlow.
☆234 · Updated 3 weeks ago
Related projects:
- Ten Thousand German News Articles Dataset for Topic Classification ☆81 · Updated last year
- Named Entity Recognition data for Europeana Newspapers ☆171 · Updated last year
- A tokenizer and sentence splitter for German and English web and social media texts. ☆135 · Updated last month
- A lemmatizer for German language text ☆87 · Updated last year
- Compound splitter for German ☆102 · Updated 4 years ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆446 · Updated 3 weeks ago
- spaCy + UDPipe ☆159 · Updated 2 years ago
- semi supervised guided topic model with custom guidedLDA ☆497 · Updated 3 years ago
- 🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/inception 👈 -- ⚠️ The official WebAnno repository has reached the… ☆243 · Updated last year
- The Zurich Dependency Parser for German ☆81 · Updated 2 years ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆153 · Updated last year
- Automatically exported from code.google.com/p/universal-pos-tags ☆129 · Updated 2 years ago
- German Morphological Analyzer ☆45 · Updated 2 years ago
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre… ☆82 · Updated 3 years ago
- Making sense embedding out of word embeddings using graph-based word sense induction ☆212 · Updated 3 years ago
- 🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪 ☆74 · Updated 3 years ago
- Open German WordNet ☆87 · Updated 7 months ago
- Code and data for inducing domain-specific sentiment lexicons. ☆195 · Updated last month
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆724 · Updated last month
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ☆308 · Updated this week
- Retrofitting Word Vectors to Semantic Lexicons ☆373 · Updated 5 years ago
- Named Entity Recognition (LSTM + CRF + FastText) with models for [historic] German ☆26 · Updated 3 years ago
- Various Algorithms for Short Text Mining ☆466 · Updated last week
- a Deep Learning Framework for Text https://delft.readthedocs.io/ ☆387 · Updated 2 months ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆76 · Updated 3 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆76 · Updated 7 months ago
- NLP French language model implementing ULMFiT ☆86 · Updated 5 years ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆249 · Updated 2 weeks ago
- An unsupervised compound splitter ☆40 · Updated 4 years ago
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html ☆139 · Updated 2 years ago