coosto / dutch-word-embeddings
Dutch word embeddings, trained on a large collection of Dutch social media messages and news/blog/forum posts.
☆43 Updated 2 years ago
Related projects:
- BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s … ☆133 Updated last year
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre… ☆82 Updated 3 years ago
- A Dutch RoBERTa-based language model ☆196 Updated 5 months ago
- 110k Dutch Book Reviews Dataset for Sentiment Analysis ☆30 Updated 11 months ago
- Language Models for Zalando's flair library ☆62 Updated 4 years ago
- Athens NLP Summer School Labs ☆42 Updated 6 months ago
- A Python wrapper for the multilingual temporal tagger HeidelTime. ☆26 Updated 2 years ago
- spaCy pipeline object for negating concepts in text ☆273 Updated 3 months ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆87 Updated 3 years ago
- spaCy + UDPipe ☆159 Updated 2 years ago
- The weights for the embedding layer of Scandinavian ULMFiT language models ☆33 Updated 4 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆76 Updated 7 months ago
- AlBERTo, the first Italian BERT model for Twitter language understanding ☆70 Updated 4 years ago
- E3C is a freely available multilingual corpus (Italian, English, French, Spanish, and Basque) of semantically annotated clinical narrativ… ☆22 Updated 8 months ago
- 🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪 ☆74 Updated 3 years ago
- ☆64 Updated last year
- Approximate randomization testing. ☆18 Updated 4 years ago
- UIMA CAS processing library written in Python ☆84 Updated 4 months ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆153 Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆72 Updated 2 months ago
- Intelligently expand and create contractions in text leveraging grammar checking and Word Mover's Distance. ☆74 Updated 2 years ago
- A multilingual lexicon of words to hurt. ☆77 Updated 3 weeks ago
- Training Temporal Word Embeddings with a Compass ☆63 Updated last year
- This repository contains all new resources that were created for the NAACL-2018 paper "Inducing a Lexicon of Abusive Words -- A Feature-B… ☆27 Updated 5 years ago
- NLP French language model implementing ULMFiT ☆86 Updated 5 years ago
- Framework for fine-tuning pretrained transformers for Named-Entity Recognition (NER) tasks ☆155 Updated last year
- Multi-Annotator Competence Estimation tool ☆62 Updated 5 years ago
- A minimal, pure Python library to interface with CoNLL-U format files. ☆149 Updated last year
- A tokenizer and sentence splitter for German and English web and social media texts. ☆135 Updated last month
- A Super-Lightweight Annotation Tool for Experts: Label text in a terminal with just Python ☆99 Updated 7 months ago