orgtre / top-open-subtitles-sentences
Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code
☆27Updated last month
Alternatives and similar repositories for top-open-subtitles-sentences:
Users who are interested in top-open-subtitles-sentences are comparing it to the repositories listed below:
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.☆51Updated 3 years ago
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http…☆23Updated 7 years ago
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code☆62Updated last year
- Global ASP - African Storybook Project for the World☆14Updated 3 months ago
- British English pronunciation dictionary☆92Updated 7 years ago
- Offline bilingual dictionaries made using data from Wiktionary☆52Updated 9 years ago
- Morphological Dictionaries for German Language☆28Updated 6 years ago
- 🏆 • 5050 most frequent words in 109 languages☆42Updated 2 years ago
- Wiktionary parser tool for many language editions.☆54Updated 2 years ago
- Aksharamukha Python Library☆44Updated last month
- cc-kedict: Creative Commons Korean-English Dictionary☆41Updated 3 years ago
- Open Language Profiles — English profile datasets from CEFR-J☆118Updated 4 years ago
- 📈 A forced aligner intended for synchronization of narrated text☆91Updated 2 years ago
- A cloud-based, open-source system for writing and publishing dictionaries.☆89Updated last year
- Fifteen Thousand Useful Phrases, by Greenville Kleiser☆54Updated 8 years ago
- A modern, interlingual wordnet interface for Python☆233Updated last week
- Small example scripts for working with Japanese texts in Python☆26Updated 5 years ago
- A list of vocabulary lists☆21Updated 4 years ago
- Creates interlinearized versions of books (EPUB, MOBI, etc.), adding "subtitles" with translations under each word in the text.☆23Updated 4 years ago
- A corpus of short answers written by learners of English and graded with CEFR levels☆10Updated 3 years ago
- Gather modern English word frequencies from all enwiki articles.☆211Updated last year
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo…☆98Updated last week
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary☆89Updated this week
- An advanced, extensible web front-end for the Manatee-open corpus search engine☆64Updated this week
- ☆15Updated last year
- Offline etymological dictionary based on Wiktionary data☆21Updated 3 years ago
- Lists of the most frequently used English words, nouns, verbs, etc.☆57Updated 4 years ago
- Practice Chinese language grammar☆16Updated 3 years ago
- Hanzipy is a Chinese character and NLP module for Chinese language processing in Python. It is primarily written to help provide a frame…☆19Updated last year