emsi / wordvectors
How to train Word2Vec for your language.
☆11 Updated 7 years ago
Related projects
Alternatives and complementary repositories for wordvectors
- Evaluation of Sentence Representations in Polish ☆22 Updated last year
- RoBERTa models for Polish ☆86 Updated 2 years ago
- Polish morphological tagger. ☆43 Updated last year
- Polish RoBERTa model trained on Polish literature, Wikipedia, and Oscar. The major assumption is that quality text will give a good mode… ☆33 Updated 3 years ago
- HerBERT is a BERT-based Language Model trained on Polish Corpora using only MLM objective with dynamic masking of whole words. ☆65 Updated 2 years ago
- Popular stopwords for general languages - very useful for building dictionaries, searchers or text indexes ☆45 Updated 11 years ago
- Pre-trained models and language resources for Natural Language Processing in Polish ☆325 Updated 5 months ago
- A curated list of resources dedicated to Natural Language Processing (NLP) in Polish. Models, tools, datasets. ☆294 Updated 3 years ago
- E3C is a freely available multilingual corpus (Italian, English, French, Spanish, and Basque) of semantically annotated clinical narrativ… ☆24 Updated 10 months ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆242 Updated last year
- Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service ☆49 Updated 7 months ago
- Building a text classifier with extremely small datasets ☆44 Updated 5 years ago
- A Greek edition of BERT pre-trained language model ☆142 Updated 4 months ago
- Tool for named entity recognition for Polish based on deep learning. ☆30 Updated last year
- This repo is the home of Romanian Transformers. ☆93 Updated 2 years ago
- Mapping a variable-length sentence to a fixed-length vector using BERT model ☆123 Updated 5 years ago
- Code and supplementary materials for a series of Medium articles about the BERT model ☆77 Updated last year
- Clustering sentence embeddings to extract message intent ☆167 Updated 3 years ago
- HuSpaCy: industrial-strength Hungarian natural language processing ☆155 Updated 3 weeks ago
- Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models. ☆519 Updated 3 months ago
- BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s … ☆135 Updated last year
- GilBERTo: A pretrained language model based on RoBERTa for Italian ☆73 Updated 4 years ago
- NLP French language model implementing ULMFiT ☆87 Updated 5 years ago
- This is where I put things I find useful that speed up my work with Machine Learning. Ever looked in your old projects to reuse those coo… ☆257 Updated 2 years ago
- The official tool for transforming doccano format into common dataset formats. ☆105 Updated last year
- Text tokenization and sentence segmentation (segtok v2) ☆203 Updated 2 years ago
- Romanian WordNet (Data + API for Python) ☆49 Updated 4 years ago
- Resources for doing NLP in Polish ☆44 Updated 5 years ago