dumitrescustefan / RoWordNet
Romanian WordNet (Data + API for Python)
☆49 Updated 4 years ago
Related projects
Alternatives and complementary repositories for RoWordNet
- Romanian Named Entity Corpus (RONEC) version 2.0 ☆60 Updated 2 years ago
- This repo is the home of Romanian Transformers. ☆93 Updated 2 years ago
- A list of Romanian NLP Datasets ☆31 Updated last month
- This is a neural spell checker ☆60 Updated last year
- A sentence segmenter that actually works! ☆302 Updated 4 years ago
- Fast and accurate spell correction library ☆76 Updated 2 years ago
- NeuSpell: A Neural Spelling Correction Toolkit ☆671 Updated last year
- Bitextor generates translation memories from multilingual websites ☆291 Updated last week
- Punctuation restoration and spell correction experiments. ☆248 Updated 3 years ago
- 🏖 TagEditor - Annotation tool for spaCy ☆187 Updated 2 years ago
- Neural based model for automatic diacritics restoration. ☆22 Updated 6 years ago
- Ten Thousand German News Articles Dataset for Topic Classification ☆84 Updated 2 years ago
- xfspell — the Transformer Spell Checker ☆187 Updated 4 years ago
- Text and Punctuation correction with Deep Learning ☆129 Updated 4 years ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆409 Updated last month
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆351 Updated last year
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ☆157 Updated last month
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆220 Updated last year
- Automatic extraction of edited sentences from text edition histories. ☆81 Updated 2 years ago
- 📃 Language Model based sentences scoring library ☆303 Updated 2 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆230 Updated 2 years ago
- Crowd sourced training data for Rasa NLU models ☆199 Updated 10 months ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 Updated last year
- Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing ☆555 Updated 2 weeks ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆112 Updated 6 months ago
- Romanian Word Embeddings. Here you can find pre-trained corpora of word embeddings. Current methods: CBOW, Skip-Gram, Fast-Text (from Gen… ☆12 Updated 10 months ago
- Named Entity Recognition for Romanian, based on transformer models ☆12 Updated 2 years ago
- Text tokenization and sentence segmentation (segtok v2) ☆203 Updated 2 years ago
- A tool that locates, downloads, and extracts machine translation corpora ☆147 Updated 5 months ago
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ☆312 Updated last month