dumitrescustefan / RoWordNet
Romanian WordNet (Data + API for Python)
☆49 · Updated 4 years ago
Related projects
Alternatives and complementary repositories for RoWordNet
- This repo is the home of Romanian Transformers. ☆93 · Updated 2 years ago
- Romanian Named Entity Corpus (RONEC) version 2.0 ☆60 · Updated last year
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 · Updated last year
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆409 · Updated last month
- Automatically constructing corpus for automatic speech recognition from YouTube videos ☆153 · Updated 4 years ago
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆220 · Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆230 · Updated 2 years ago
- A list of Romanian NLP Datasets ☆30 · Updated 3 weeks ago
- xfspell — the Transformer Spell Checker ☆187 · Updated 4 years ago
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ☆310 · Updated 3 weeks ago
- LASER multilingual sentence embeddings as a pip package ☆225 · Updated last year
- Punctuation restoration and spell correction experiments. ☆248 · Updated 3 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆350 · Updated last year
- The code to reproduce results from paper "MultiFiT: Efficient Multi-lingual Language Model Fine-tuning" https://arxiv.org/abs/1909.04761 ☆284 · Updated 4 years ago
- Fast and accurate spell correction library ☆76 · Updated 2 years ago
- Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing ☆554 · Updated last week
- A python module for English lemmatization and inflection. ☆261 · Updated last year
- Text tokenization and sentence segmentation (segtok v2) ☆203 · Updated 2 years ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆249 · Updated 2 months ago
- This is a neural spell checker ☆60 · Updated last year
- A sentence segmenter that actually works! ☆302 · Updated 4 years ago
- A novel dataset for emotion detection from Romanian text. ☆15 · Updated 2 weeks ago
- Named Entity Recognition for Romanian, based on transformer models ☆12 · Updated 2 years ago
- A tool that locates, downloads, and extracts machine translation corpora ☆147 · Updated 5 months ago
- A minimal, pure Python library to interface with CoNLL-U format files. ☆149 · Updated last year
- CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed) ☆348 · Updated 3 years ago
- Ten Thousand German News Articles Dataset for Topic Classification ☆84 · Updated 2 years ago
- Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing ☆734 · Updated 3 weeks ago
- Complementary code for our paper "Automatic punctuation restoration with BERT models" ☆48 · Updated last year
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆144 · Updated this week