orgtre / google-books-ngram-frequency
Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code
☆76 · Updated last year
Alternatives and similar repositories for google-books-ngram-frequency
Users interested in google-books-ngram-frequency are comparing it to the repositories listed below.
- Creates interlinearized versions of books (EPUB, MOBI, etc.), adding "subtitles" with translations under each word in the text. ☆24 · Updated 4 years ago
- Python package for Wikimedia dump processing (Wiktionary, Wikipedia, etc.). Wikitext parsing, template expansion, Lua module execution. Fo… ☆102 · Updated last month
- All the words from Google Books, sorted by frequency ☆117 · Updated last year
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code ☆37 · Updated 4 months ago
- A modern, interlingual wordnet interface for Python ☆251 · Updated this week
- 《国际中文教育中文水平等级标准》 Query System of Chinese Proficiency Grading Standards for International Chinese Language Education, New HSK Levels … ☆31 · Updated last year
- A Python module for English lemmatization and inflection. ☆268 · Updated last year
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆164 · Updated 2 weeks ago
- Verb forms dictionary ☆66 · Updated 7 years ago
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆74 · Updated this week
- Offline bilingual dictionaries made using data from Wiktionary ☆55 · Updated 10 years ago
- English Lemma Database - Compiled by Referencing British National Corpus ☆31 · Updated 9 months ago
- Wiktionary dump file parser and multilingual data extractor ☆940 · Updated last week
- A list of vocabulary lists ☆21 · Updated 4 years ago
- A text file containing English words, along with the definition, parts of speech (noun, verb, adjective, etc.), and a link to the url where … ☆12 · Updated last year
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆184 · Updated last year
- A practical Python library for identifying morphemes. ☆12 · Updated 2 years ago
- The Open English WordNet ☆576 · Updated this week
- MFTE (Multi Feature Tagger of English) Python is the Python version based on Le Foll's MFTE written in Perl. It is extended to include se… ☆25 · Updated last month
- Gather modern English word frequencies from all enwiki articles. ☆216 · Updated last year
- cc-kedict: Creative Commons Korean-English Dictionary ☆41 · Updated 3 years ago
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆72 · Updated 6 months ago
- Interactive visualization of Wiktionary words and etymologies. ☆93 · Updated this week
- Multilingual sentence alignment using sentence embeddings ☆120 · Updated 7 months ago
- List of Chinese characters ordered by frequency rank (from most common to least common). Based on Jun Da's Modern Chinese Character Frequ… ☆34 · Updated last year
- Monolingual wordlists with pronunciation information in IPA ☆632 · Updated last month
- Helsinki Finite-State Technology (library and application suite) ☆131 · Updated last month
- Hanzipy is a Chinese character and NLP module for Chinese language processing for Python. It is primarily written to help provide a frame… ☆21 · Updated last year
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars ☆39 · Updated 8 months ago
- Jupyter notebooks for the course "Computational Morphology with HFST". ☆18 · Updated 2 years ago