devmount / GermanWordEmbeddings
Toolkit to obtain and preprocess German text corpora, train models, and evaluate them with generated test sets. Built with Gensim and TensorFlow.
☆237 · Updated 7 months ago
Alternatives and similar repositories for GermanWordEmbeddings:
Users who are interested in GermanWordEmbeddings compare it to the repositories listed below.
- A lemmatizer for German language text ☆88 · Updated 2 years ago
- Named Entity Recognition data for Europeana Newspapers ☆171 · Updated last year
- Ten Thousand German News Articles Dataset for Topic Classification ☆84 · Updated 2 years ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆468 · Updated 4 months ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆139 · Updated 3 months ago
- German Morphological Analyzer ☆47 · Updated 3 years ago
- Compound splitter for German ☆104 · Updated 4 years ago
- AmbiverseNLU: A Natural Language Understanding suite by Max Planck Institute for Informatics ☆210 · Updated last year
- A Deep Learning Framework for Text (https://delft.readthedocs.io/) ☆397 · Updated 3 weeks ago
- spaCy + UDPipe ☆161 · Updated 2 years ago
- Making sense embedding out of word embeddings using graph-based word sense induction ☆213 · Updated 3 years ago
- Various utilities for processing the data. ☆208 · Updated this week
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆733 · Updated 7 months ago
- Automatically exported from code.google.com/p/universal-pos-tags ☆129 · Updated 2 years ago
- A Dataset of German Legal Documents for Named Entity Recognition ☆166 · Updated 2 years ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆156 · Updated 2 years ago
- Python library for Natural Language Preprocessing (NLPre) ☆190 · Updated last year
- Language independent truecaser in Python. ☆160 · Updated 3 years ago
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested Python dictionary. ☆314 · Updated last month
- Anonymization of legal cases (Fr) based on Flair embeddings ☆88 · Updated 4 years ago
- Information extraction from English and German texts based on predicate logic ☆389 · Updated 2 years ago
- This repository contains all manually labeled data from the GermEval-2018 shared task. ☆30 · Updated 6 years ago
- A multilingual, cross-domain temporal tagger developed at the Database Systems Research Group at Heidelberg University. ☆344 · Updated last year
- An introduction to using spaCy for NLP and machine learning ☆191 · Updated 3 years ago
- Named Entity Recognition (LSTM + CRF + FastText) with models for [historic] German ☆26 · Updated 3 years ago
- Various Algorithms for Short Text Mining ☆470 · Updated this week
- UIMA CAS processing library written in Python ☆87 · Updated this week
- A part-of-speech tagger with support for domain adaptation and external resources. ☆22 · Updated 2 years ago
- Anafora is a web-based raw text annotation tool ☆241 · Updated 2 years ago
- spaCy pipeline object for negating concepts in text ☆279 · Updated 9 months ago