david47k / top-english-wordlists
Lists of most-frequently-used English words / nouns / verbs etc.
☆65Updated 4 years ago
Alternatives and similar repositories for top-english-wordlists
Users that are interested in top-english-wordlists are comparing it to the libraries listed below
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code☆35Updated 3 months ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo…☆101Updated 2 weeks ago
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code☆71Updated last year
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http…☆24Updated 8 years ago
- A sentence segmentation library with wide language support optimized for speed and utility.☆65Updated 9 months ago
- Machine-readable lists of lemma-token pairs in 23 languages.☆340Updated 3 years ago
- A modern, interlingual wordnet interface for Python☆247Updated this week
- Gather modern English word frequencies from all enwiki articles.☆213Updated last year
- A list of vocabulary lists☆21Updated 4 years ago
- Scrapes Google Books Ngram data to create a long word list☆13Updated last year
- A list of words from the SUBTLEX movie subtitles corpus, sorted by frequency.☆33Updated 5 years ago
- This repo contains a list of the 44,998 most common Japanese words in order of frequency, as determined by the University of Leeds Corpus…☆73Updated 6 years ago
- OPUS-CAT is a collection of software which makes it possible to use OPUS-MT neural machine translation models in professional translation. OPU…☆79Updated 4 months ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies)☆30Updated 3 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆52Updated last month
- English Lemma Database - Compiled by Referencing British National Corpus☆31Updated 8 months ago
- List of English synonyms and antonyms parsed from the public domain book of James C. Fernald, 1896☆43Updated 6 years ago
- English vocabulary word list with definition, audio, pronunciation and example☆25Updated last year
- PyDictionary is an offline English dictionary made using Python along with the Wordnet Lexical Database and Enchant Spell Dictionary. The…☆19Updated 4 years ago
- Sentence aligner☆113Updated 4 years ago
- All the words from Google Books, sorted by frequency☆116Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder.☆246Updated 2 years ago
- List of ~275,000 English words☆226Updated last year
- The Open Parallel Corpus☆72Updated 2 months ago
- Verb forms dictionary☆66Updated 7 years ago
- A list of awesome Machine Translation frameworks, libraries, software and papers☆192Updated 10 months ago
- Translation demonstrator☆33Updated 5 years ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo…☆108Updated 6 years ago
- 🏆 • 5050 most frequent words in 109 languages☆42Updated 2 years ago
- Small-vocabulary neural sequence-to-sequence generation with optional feature conditioning☆32Updated last week