dumitrescustefan / ronec
Romanian Named Entity Corpus (RONEC) version 2.0
☆63 Updated 2 years ago
Alternatives and similar repositories for ronec
Users interested in ronec are comparing it to the repositories listed below:
- This repo is the home of Romanian Transformers. ☆101 Updated 2 years ago
- Compound splitter for German ☆104 Updated 5 years ago
- Named Entity Recognition for Romanian, based on transformer models ☆13 Updated 3 years ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆157 Updated 2 years ago
- A novel dataset for emotion detection from Romanian text. ☆17 Updated 2 months ago
- Compound splitter for German language ("Komposita-Zerlegung") based on large dictionary combined with highly efficient multi-pattern stri… ☆25 Updated 2 years ago
- Ten Thousand German News Articles Dataset for Topic Classification ☆84 Updated 2 years ago
- Named Entity Recognition (LSTM + CRF + FastText) with models for [historic] German ☆26 Updated 3 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆140 Updated 4 months ago
- Plan and train German transformer models. ☆23 Updated 4 years ago
- Romanian Semantic Textual Similarity Dataset ☆16 Updated 2 years ago
- BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s … ☆137 Updated 2 years ago
- UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files ☆377 Updated 5 months ago
- German Morphological Analyzer ☆47 Updated 3 years ago
- Shared BERT model for 4 languages of Bulgarian, Czech, Polish and Russian. Slavic NER model. ☆73 Updated 3 years ago
- A python wrapper for the multilingual temporal tagger HeidelTime. ☆26 Updated 3 years ago
- A Dutch RoBERTa-based language model ☆201 Updated last year
- Unsupervised Language Model Pre-training for French ☆248 Updated 2 years ago
- Romanian WordNet (Data + API for Python) ☆51 Updated 4 years ago
- Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated testsets. Built with Gensim and Tenso… ☆236 Updated 8 months ago
- Text tokenization and sentence segmentation (segtok v2) ☆201 Updated 3 years ago
- 🏖TagEditor - Annotation tool for spaCy ☆193 Updated 2 years ago
- A neural network that jointly part-of-speech tags and lemmatizes sentences, boosting accuracy for morphologically-rich languages (Czech, … ☆34 Updated 6 years ago
- A sentence segmenter that actually works! ☆306 Updated 4 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆154 Updated 5 months ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 Updated 9 months ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆478 Updated 5 months ago
- A tool for automatic spelling normalization ☆20 Updated 4 years ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆112 Updated 11 months ago
- 110k Dutch Book Reviews Dataset for Sentiment Analysis ☆29 Updated last year