lpmi-13 / machine_readable_wordlists
A collection of word lists in machine readable, web-native (.yml and .json) format
☆24 Updated 2 years ago
Alternatives and similar repositories for machine_readable_wordlists
Users who are interested in machine_readable_wordlists are comparing it to the repositories listed below.
- Open Language Profiles: English profile datasets from CEFR-J ☆167 Updated 5 years ago
- A Python Wiktionary Parser ☆371 Updated 6 months ago
- Jupyter notebooks for course "Computational Morphology with HFST". ☆19 Updated 3 years ago
- A list of vocabulary lists ☆22 Updated 5 years ago
- A multilingual parallel corpus created from translations of the Bible. ☆191 Updated 8 months ago
- Sentence aligner ☆124 Updated 4 years ago
- Romanian Named Entity Corpus (RONEC) version 2.0 ☆66 Updated 3 years ago
- An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models. ☆31 Updated last year
- Machine-Translation-based sentence alignment tool for parallel text ☆314 Updated 4 years ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆515 Updated last year
- Wiktionary dump file parser and multilingual data extractor ☆1,088 Updated 3 weeks ago
- Curated corpus of parallel data derived from versions of the Bible provided by eBible.org. ☆80 Updated 8 months ago
- ☆33 Updated last year
- ☆67 Updated 5 months ago
- Linguistic and stylistic complexity measures for (literary) texts ☆84 Updated 2 years ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆39 Updated 2 months ago
- Gather modern English word frequencies from all enwiki articles. ☆228 Updated last year
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆115 Updated last year
- Lexical database for ~70k English words with morphological variables ☆50 Updated 4 years ago
- Various utilities for processing the data. ☆217 Updated this week
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ☆319 Updated 2 weeks ago
- NLP tools for Kazakh language ☆35 Updated 3 years ago
- Analyzes the given text and determines its vocabulary level based on CEFR levels ☆49 Updated 3 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆386 Updated 2 years ago
- Python framework for processing Universal Dependencies data ☆59 Updated last week
- FreeLing project source code ☆260 Updated 2 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆255 Updated 3 years ago
- ☆15 Updated 7 years ago
- Framework for training dependency parsing models. ☆12 Updated last year
- Neural Adaptive Machine Translation that adapts to context and learns from corrections. ☆350 Updated 3 years ago