rspeer / wordfreq
Access a database of word frequencies, in various natural languages.
☆1,549 Updated 9 months ago
Alternatives and similar repositories for wordfreq
Users who are interested in wordfreq are comparing it to the libraries listed below.
- The Open English WordNet ☆634 Updated last week
- A Python Wiktionary Parser ☆364 Updated 2 months ago
- Wiktionary dump file parser and multilingual data extractor ☆1,015 Updated last week
- A Python parser for MediaWiki wikicode ☆835 Updated 3 months ago
- python package to calculate readability statistics of a text object - paragraphs, sentences, articles. ☆1,321 Updated last month
- Machine-readable lists of lemma-token pairs in 23 languages. ☆343 Updated 3 years ago
- Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations. ☆798 Updated last week
- A modern, interlingual wordnet interface for Python ☆263 Updated last month
- Accurately generate all possible forms of an English word e.g "election" --> "elect", "electoral", "electorate" etc. ☆632 Updated 4 years ago
- Gather modern English word frequencies from all enwiki articles. ☆225 Updated last year
- Repository for Frequency Word List Generator and processed files ☆1,362 Updated 3 years ago
- ☆852 Updated 2 years ago
- All languages stopwords collection ☆458 Updated last year
- Heuristic based boilerplate removal tool ☆796 Updated 7 months ago
- English Lemma Database - Compiled by Referencing British National Corpus ☆32 Updated last year
- Python stemming library using snowball stemmers ☆264 Updated last month
- Compact Language Detector 2 ☆875 Updated 4 years ago
- SCOWL (and friends). ☆446 Updated 2 months ago
- All the words from Google Books, sorted by frequency ☆118 Updated 2 years ago
- The Open Source Dictionary ☆574 Updated 6 months ago
- hand-written dictionaries from the FreeDict project ☆437 Updated 2 months ago
- A python module for English lemmatization and inflection. ☆272 Updated 2 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆176 Updated 4 months ago
- A Python library to parse MediaWiki WikiText ☆314 Updated 4 months ago
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆159 Updated 9 months ago
- Article extraction benchmark: dataset and evaluation scripts ☆331 Updated 2 weeks ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆53 Updated 4 years ago
- Python wrapper for Wikipedia ☆698 Updated this week
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆107 Updated 2 weeks ago
- List of common stop words in various languages. ☆337 Updated 3 years ago