dumitrescustefan / ronec
Romanian Named Entity Corpus (RONEC) version 2.0
☆62 Updated 2 years ago
Alternatives and similar repositories for ronec:
Users who are interested in ronec compare it with the repositories listed below.
- This repo is the home of Romanian Transformers. ☆101 Updated 2 years ago
- A novel dataset for emotion detection from Romanian text. ☆17 Updated last month
- Named Entity Recognition for Romanian, based on transformer models ☆13 Updated 3 years ago
- Text tokenization and sentence segmentation (segtok v2) ☆202 Updated 3 years ago
- Ten Thousand German News Articles Dataset for Topic Classification ☆84 Updated 2 years ago
- UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files ☆377 Updated 4 months ago
- Romanian WordNet (Data + API for Python) ☆51 Updated 4 years ago
- German Morphological Analyzer ☆47 Updated 3 years ago
- 🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪 ☆76 Updated 3 years ago
- A Dutch RoBERTa-based language model ☆199 Updated 11 months ago
- Linguistic and stylistic complexity measures for (literary) texts ☆80 Updated last year
- Romanian Semantic Textual Similarity Dataset ☆16 Updated 2 years ago
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆214 Updated 2 months ago
- Various utilities for processing the data. ☆208 Updated this week
- Jupyter notebooks for course "Computational Morphology with HFST". ☆18 Updated 2 years ago
- BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s … ☆136 Updated 2 years ago
- A sentence segmenter that actually works! ☆305 Updated 4 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆139 Updated 3 months ago
- Crawler for linguistic corpora ☆205 Updated last year
- xfspell — the Transformer Spell Checker ☆189 Updated 4 years ago
- ☆44 Updated 2 years ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆471 Updated 5 months ago
- Sentiment Corpus for Swedish 🇸🇪 Norwegian 🇳🇴 Danish 🇩🇰 Finnish 🇫🇮 (and English 🏴) ☆15 Updated 3 years ago
- This is a simple Python package for calculating a variety of lexical diversity indices ☆73 Updated last year
- Shared BERT model for 4 languages of Bulgarian, Czech, Polish and Russian. Slavic NER model. ☆73 Updated 3 years ago
- A multilingual lexicon of words to hurt. ☆87 Updated 4 months ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆244 Updated last year
- Neural based model for automatic diacritics restoration. ☆25 Updated 6 years ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆156 Updated 2 years ago
- A list of Romanian NLP Datasets ☆40 Updated last month