dumitrescustefan / RO-STS
Romanian Semantic Textual Similarity Dataset
☆15 · Updated 2 years ago
Alternatives and similar repositories for RO-STS:
Users who are interested in RO-STS are comparing it to the repositories listed below.
- This repo is the home of Romanian Transformers. ☆98 · Updated 2 years ago
- A novel dataset for emotion detection from Romanian text. ☆17 · Updated 2 months ago
- Romanian Named Entity Corpus (RONEC) version 2.0 ☆61 · Updated 2 years ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 · Updated 2 years ago
- Polish RoBERTa model trained on Polish literature, Wikipedia, and Oscar. The major assumption is that quality text will give a good mode… ☆34 · Updated 3 years ago
- ☆106 · Updated last year
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆155 · Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆78 · Updated 6 months ago
- XAI Tutorial for the Explainable AI track in the ALPS winter school 2021 ☆58 · Updated 3 years ago
- ☆50 · Updated 2 years ago
- A French sequence-to-sequence pretrained model ☆57 · Updated 2 years ago
- MILES is a multilingual text simplifier inspired by LSBert - A BERT-based lexical simplification approach proposed in 2018. Unlike LSBert… ☆48 · Updated 3 years ago
- Evaluation of Sentence Representations in Polish ☆22 · Updated 2 years ago
- Named Entity Recognition for Romanian, based on transformer models ☆12 · Updated 2 years ago
- Ten Thousand German News Articles Dataset for Topic Classification ☆84 · Updated 2 years ago
- xfspell — the Transformer Spell Checker ☆188 · Updated 4 years ago
- Some notebooks for NLP ☆189 · Updated last year
- Sentence transformers models for SpaCy ☆107 · Updated last year
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ☆159 · Updated 3 months ago
- 110k Dutch Book Reviews Dataset for Sentiment Analysis ☆30 · Updated last year
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆354 · Updated last year
- Norwegian Transformer Model ☆115 · Updated last month
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre… ☆82 · Updated 3 years ago
- ☆31 · Updated 5 years ago
- Unannotated Spanish 3 Billion Words Corpora ☆94 · Updated 2 years ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆87 · Updated 4 years ago
- Polish BERT ☆70 · Updated 4 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Updated 10 months ago
- ☆44 · Updated 2 years ago
- Natural language understanding benchmarks for Norwegian ☆14 · Updated last year