AndyTheFactory / romanian-nlp-datasets
A list of Romanian NLP Datasets
☆41 Updated last month
Alternatives and similar repositories for romanian-nlp-datasets:
Users interested in romanian-nlp-datasets are comparing it to the repositories listed below.
- This repo is the home of Romanian Transformers. ☆101 Updated 2 years ago
- A novel dataset for emotion detection from Romanian text. ☆17 Updated last month
- Romanian Named Entity Corpus (RONEC) version 2.0 ☆62 Updated 2 years ago
- A list of Natural Language Processing Tools for Romanian ☆29 Updated 4 years ago
- Romanian Semantic Textual Similarity Dataset ☆16 Updated 2 years ago
- A Python library for calculating a large variety of metrics from text ☆334 Updated 3 months ago
- Romanian WordNet (Data + API for Python) ☆51 Updated 4 years ago
- Neural based model for automatic diacritics restoration. ☆25 Updated 6 years ago
- The robust European language model benchmark. ☆94 Updated this week
- A project for training foundational Danish language model ☆72 Updated this week
- TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und… ☆336 Updated 7 months ago
- A Python package for text preprocessing tasks in natural language processing. ☆63 Updated 2 years ago
- This repository contains EmoBank, a large-scale text corpus manually annotated with emotion according to the psychological Valence-Arousa… ☆203 Updated 2 years ago
- This repository provides usage examples for the Python module Newspaper3k. ☆146 Updated last year
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆214 Updated 2 months ago
- Named Entity Recognition for Romanian, based on transformer models ☆13 Updated 3 years ago
- Catalog of abusive language data (PLoS 2020) ☆309 Updated 9 months ago
- A very simple news crawler with a funny name ☆362 Updated last week
- SpanMarker for Named Entity Recognition ☆424 Updated 2 months ago
- Some notebooks for NLP ☆199 Updated last year
- Polish RoBERTA model trained on Polish literature, Wikipedia, and Oscar. The major assumption is that quality text will give a good mode… ☆34 Updated 3 years ago
- analyze text with empath ☆324 Updated 7 years ago
- A curated list of resources such as tools and datasets useful for the processing of Slovak language ☆19 Updated 2 weeks ago
- ☆158 Updated 9 months ago
- This repository contains the HiNER dataset released with our paper at LREC 2022 ☆14 Updated last year
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation ☆76 Updated 3 weeks ago
- A Scandinavian Benchmark for sentence embeddings ☆36 Updated last month
- [GSOC] Greek language support for spacy.io python NLP software ☆101 Updated 6 years ago
- Pre-trained Nordic models for BERT ☆168 Updated 3 years ago
- BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them t… ☆137 Updated 9 months ago