harshnative / words-dataset
A data set of over 600,000 English words, each listed with its frequency
★15 · Updated 3 years ago
Alternatives and similar repositories for words-dataset:
Users that are interested in words-dataset are comparing it to the libraries listed below
- English Lemma Database - Compiled by Referencing British National Corpus ★30 · Updated 6 months ago
- 5050 most frequent words in 109 languages ★42 · Updated 2 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ★243 · Updated 2 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ★154 · Updated 4 months ago
- Lightweight string similarity function for JavaScript ★98 · Updated last year
- WordNet in JSON format. ★90 · Updated 4 years ago
- RosaeNLG is a Natural Language Generation library for node.js and browser rendering, based on the Pug template engine. ★99 · Updated 3 months ago
- A Python library for detecting and filtering profanity ★163 · Updated 4 years ago
- A modern, interlingual wordnet interface for Python ★236 · Updated last week
- JS Trie / DAWG classes ★30 · Updated last year
- Split {Japanese, English} text into sentences. ★124 · Updated last year
- Diff English sentences via Levenshtein distance, calculate word error rate, and list word-by-word differences ★10 · Updated 4 years ago
- Gather modern English word frequencies from all enwiki articles. ★212 · Updated last year
- NLP system for predicting the reading difficulty level of a text in terms of its CEFR level. ★50 · Updated 3 months ago
- Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible f… ★211 · Updated 4 months ago
- A list of vocabulary lists ★21 · Updated 4 years ago
- Converts English text to IPA notation ★380 · Updated last year
- PyMultiDictionary is a dictionary module that gets meanings, translations, synonyms, and antonyms of words in 20 different languages ★50 · Updated last week
- An NLP pipeline for Hebrew ★37 · Updated 3 weeks ago
- Create synchronized replicas of a DOM element using React ★13 · Updated last week
- A repository of words in multiple languages sorted by their frequency ★11 · Updated last year
- Unicode to ASCII transliteration - C Elixir Go Java JS Julia PHP Python Ruby Rust Shell .NET ★301 · Updated last month
- A parallel corpus of Sorani, Kurmanji and English ★11 · Updated 4 years ago
- An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also supporting some non-Uralic languages such as Span… ★77 · Updated 4 months ago
- CLDR text segmentation for JavaScript ★38 · Updated 11 months ago
- Stylometry library for Burrows' Delta method ★36 · Updated 11 months ago
- Serve Next.js requests via Fastify ★13 · Updated last year
- Machine-readable lists of lemma-token pairs in 23 languages. ★335 · Updated 3 years ago
- Verb forms dictionary ★65 · Updated 7 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ★28 · Updated 3 years ago