orgtre / google-books-ngram-frequency
Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code
☆101 · Updated 2 years ago
Alternatives and similar repositories for google-books-ngram-frequency
Users who are interested in google-books-ngram-frequency are comparing it to the repositories listed below.
- All the words from Google Books, sorted by frequency ☆120 · Updated 2 years ago
- Gather modern English word frequencies from all enwiki articles. ☆227 · Updated last year
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆108 · Updated 3 weeks ago
- The World Atlas of Language Structures ☆72 · Updated last year
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code ☆51 · Updated 10 months ago
- The Open English WordNet ☆677 · Updated last week
- Hanzipy is a Chinese character and NLP module for Chinese language processing for Python. It is primarily written to help provide a frame… ☆27 · Updated 4 months ago
- A modern, interlingual wordnet interface for Python ☆276 · Updated last week
- Monolingual wordlists with pronunciation information in IPA ☆699 · Updated 7 months ago
- Verb forms dictionary ☆67 · Updated 8 years ago
- British English pronunciation dictionary ☆96 · Updated 8 years ago
- 30,000 most common English words with Chinese dictionary explanations in order of frequency. ☆194 · Updated 5 years ago
- Wiktionary dump file parser and multilingual data extractor ☆1,057 · Updated this week
- Lists of most frequently used English words / nouns / verbs etc. ☆93 · Updated 5 years ago
- A list of vocabulary lists ☆22 · Updated 5 years ago
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http… ☆24 · Updated 8 years ago
- Open Language Profiles: English profile datasets from CEFR-J ☆157 · Updated 5 years ago
- Machine-readable lists of lemma-token pairs in 23 languages. ☆353 · Updated 3 years ago
- English Lemma Database - Compiled by Referencing British National Corpus ☆34 · Updated last year
- 《国际中文教育中文水平等级标准》 查询系统 Query System of Chinese Proficiency Grading Standards for International Chinese Language Education, New HSK Levels … ☆40 · Updated last month
- Chinese language vocabulary graph generation. Python/Flask tool that performs dictionary search and analysis on Chinese Hanzi characters.… ☆155 · Updated 2 years ago
- Fifteen Thousand Useful Phrases, by Greenville Kleiser ☆56 · Updated 9 years ago
- A Python Wiktionary Parser ☆367 · Updated 5 months ago
- A list of awesome Machine Translation frameworks, libraries, software and papers ☆194 · Updated last year
- HSK 3.0 Vocabulary Lists (words and characters) ☆90 · Updated 2 years ago
- Converts English text to IPA notation ☆393 · Updated 2 years ago
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆203 · Updated last year
- The Unicode Cookbook for Linguists ☆56 · Updated 5 years ago
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary ☆94 · Updated last week
- Multilingual sentence alignment using sentence embeddings ☆131 · Updated last year