orgtre / google-books-ngram-frequency
Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code
☆96 Updated 2 years ago
Alternatives and similar repositories for google-books-ngram-frequency
Users who are interested in google-books-ngram-frequency compare it to the repositories listed below.
- Gather modern English word frequencies from all enwiki articles. ☆227 Updated last year
- All the words from Google Books, sorted by frequency ☆119 Updated 2 years ago
- The Open English WordNet ☆659 Updated 2 weeks ago
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code ☆48 Updated 9 months ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆108 Updated last week
- A Python Wiktionary Parser ☆367 Updated 3 months ago
- Open Language Profiles — English profile datasets from CEFR-J ☆153 Updated 5 years ago
- A modern, interlingual wordnet interface for Python ☆272 Updated this week
- The World Atlas of Language Structures ☆69 Updated last year
- Lists of most-frequently-used english words / nouns / verbs etc. ☆89 Updated 5 years ago
- A list of vocabulary lists ☆22 Updated 5 years ago
- Offline bilingual dictionaries made using data from Wiktionary ☆61 Updated 10 years ago
- 《国际中文教育中文水平等级标准》 查询系统 Query System of Chinese Proficiency Grading Standards for International Chinese Language Education, New HSK Levels … ☆39 Updated 2 weeks ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆51 Updated 2 years ago
- A list of awesome Machine Translation frameworks, libraries, software and papers ☆192 Updated last year
- Machine-readable lists of lemma-token pairs in 23 languages. ☆349 Updated 3 years ago
- A cloud-based, open-source system for writing and publishing dictionaries. ☆96 Updated last year
- Wiktionary dump file parser and multilingual data extractor ☆1,038 Updated this week
- Fifteen Thousand Useful Phrases, by Greenville Kleiser ☆55 Updated 9 years ago
- English Lemma Database - Compiled by Referencing British National Corpus ☆33 Updated last year
- 30,000 most common English words with Chinese dictionary explanations in order of frequency. ☆192 Updated 5 years ago
- 🏆 • 5050 most frequent words in 109 languages ☆47 Updated 2 years ago
- Verb forms dictionary ☆67 Updated 8 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆31 Updated 4 months ago
- Hanzipy is a Chinese character and NLP module for Chinese language processing for python. It is primarily written to help provide a frame… ☆27 Updated 3 months ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆56 Updated 4 years ago
- The Unicode Cookbook for Linguists ☆56 Updated 5 years ago
- A set of pipelines for performing experiments on various NLP tasks with a focus on resource-poor/minority languages. ☆36 Updated this week
- Multilingual sentence alignment using sentence embeddings ☆130 Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆51 Updated 2 years ago