allegro / HerBERT

HerBERT is a BERT-based language model trained on Polish corpora using only the masked language modeling (MLM) objective with dynamic whole-word masking.
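To make "dynamic masking of whole words" concrete, here is a minimal sketch in plain Python. It is not HerBERT's training code. It assumes WordPiece-style tokens where a `##` prefix marks a word continuation, uses made-up example tokens, and skips the 80/10/10 replacement rule from the original BERT recipe. The function names `group_whole_words` and `mask_whole_words` are hypothetical.

```python
import random

MASK = "[MASK]"  # assumed mask token, as in the original BERT vocabulary


def group_whole_words(tokens):
    """Group token indices into words; a '##' prefix continues the previous word."""
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])
    return words


def mask_whole_words(tokens, mask_prob=0.15, rng=None):
    """Mask every subword of each selected word; return (masked tokens, labels).

    Labels hold the original token at masked positions and None elsewhere.
    """
    rng = rng or random
    masked = list(tokens)
    labels = [None] * len(tokens)
    for word in group_whole_words(tokens):
        # The decision is made once per word, so a word is never half-masked.
        if rng.random() < mask_prob:
            for i in word:
                labels[i] = tokens[i]
                masked[i] = MASK
    return masked, labels


# Illustrative tokens only; a real tokenizer may split these words differently.
tokens = ["her", "##bert", "jest", "model", "##em"]

# "Dynamic" means the mask is re-sampled each time a sentence is drawn,
# rather than fixed once during preprocessing.
for epoch in range(3):
    masked, labels = mask_whole_words(tokens, mask_prob=0.5, rng=random.Random(epoch))
    print(epoch, masked)
```

Because the mask is re-sampled on every call, the same sentence can show different masked words in different epochs.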

☆67

Alternatives and similar repositories for HerBERT

Users who are interested in HerBERT are comparing it to the repositories listed below.

ksopyla / awesome-nlp-polish
A curated list of resources dedicated to Natural Language Processing (NLP) in Polish. Models, tools, datasets.
☆304 Updated 4 years ago
sdadas / polish-nlp-resources
Pre-trained models and language resources for Natural Language Processing in Polish
☆357 Updated last year
sheerun / awesome-polish-nlp
Resources for doing NLP in Polish
☆47 Updated 5 years ago
sdadas / polish-roberta
RoBERTa models for Polish
☆88 Updated 3 years ago
ipipan / spacy-pl
☆50 Updated 3 years ago
ZILiAT-NASK / BAN-PL
Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service
☆57 Updated 8 months ago
dzieciou / pystempel
Python port of Stempel, an algorithmic stemmer for the Polish language.
☆39 Updated last year
kldarek / polbert
Polish BERT
☆72 Updated 4 years ago
kwrobel-nlp / krnnt
Polish morphological tagger.
☆43 Updated 2 years ago
allegro / klejbenchmark-baselines
Fine-tuning scripts for evaluating transformer-based models on KLEJ benchmark.
☆26 Updated 2 years ago
apohllo / nlp
Natural language processing course taught at AGH University of Science and Technology
☆63 Updated this week
sdadas / polish-sentence-evaluation
Evaluation of Sentence Representations in Polish
☆23 Updated 2 years ago
Ermlab / PoLitBert
Polish RoBERTa model trained on Polish literature, Wikipedia, and Oscar. The major assumption is that quality text will give a good mode…
☆35 Updated 4 years ago
tblock / 10kGNAD
Ten Thousand German News Articles Dataset for Topic Classification
☆86 Updated 2 years ago
explosion / floret
🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy
☆321 Updated 5 months ago
kootenpv / contractions
Fixes contractions such as `you're` to `you are`
☆317 Updated 2 years ago
TheophileBlard / french-sentiment-analysis-with-bert
How good is BERT? Comparing BERT to other state-of-the-art approaches on a French sentiment analysis dataset
☆156 Updated 2 years ago
Ermlab / pl-sentiment-analysis
☆30 Updated 2 years ago
nlpaueb / greek-bert
A Greek edition of BERT pre-trained language model
☆149 Updated last year
koaning / whatlies
Toolkit to help understand "what lies" in word embeddings. Also benchmarking!
☆477 Updated 2 years ago
koaning / embetter
just a bunch of useful embeddings for scikit-learn pipelines
☆517 Updated 2 weeks ago
dborrelli / chat-intents
Clustering sentence embeddings to extract message intent
☆175 Updated 3 years ago
Kungbib / swedish-bert-models
☆141 Updated 4 years ago
certainlyio / nordic_bert
Pre-trained Nordic models for BERT
☆174 Updated 3 years ago
avidale / compress-fasttext
Tools for shrinking fastText models (in gensim format)
☆179 Updated last year
flairNLP / flair-lms
Language Models for Zalando's flair library
☆61 Updated 5 years ago
ian-beaver / pycontractions
Intelligently expand and create contractions in text leveraging grammar checking and Word Mover's Distance.
☆78 Updated 3 years ago
MartinoMensio / spacy-universal-sentence-encoder
Google USE (Universal Sentence Encoder) for spaCy
☆184 Updated 2 years ago
koaning / doubtlab
Doubt your data, find bad labels.
☆514 Updated last year
n-waves / multifit
The code to reproduce results from paper "MultiFiT: Efficient Multi-lingual Language Model Fine-tuning" https://arxiv.org/abs/1909.04761
☆282 Updated 5 years ago