rspeer / wordfreq
Access a database of word frequencies, in various natural languages.
☆1,475 · Updated 5 months ago
Alternatives and similar repositories for wordfreq
Users who are interested in wordfreq are comparing it to the libraries listed below.
- Repository for Frequency Word List Generator and processed files ☆1,279 · Updated 3 years ago
- A modern, interlingual wordnet interface for Python ☆247 · Updated this week
- A Python Wiktionary Parser ☆360 · Updated 3 months ago
- Accurately generate all possible forms of an English word, e.g. "election" --> "elect", "electoral", "electorate" etc. ☆630 · Updated 3 years ago
- Python package to calculate readability statistics of a text object - paragraphs, sentences, articles. ☆1,288 · Updated 2 weeks ago
- NLP, before and after spaCy ☆2,225 · Updated last year
- Port of Google's language-detection library to Python. ☆1,804 · Updated 3 months ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆375 · Updated 2 years ago
- Multilingual text (NLP) processing toolkit ☆2,343 · Updated last year
- A Python parser for MediaWiki wikicode ☆798 · Updated 2 months ago
- The most accurate natural language detection library for Python, suitable for short text and mixed-language text ☆1,376 · Updated last week
- Wiktionary dump file parser and multilingual data extractor ☆927 · Updated this week
- The Open English WordNet ☆558 · Updated last week
- Heuristic based boilerplate removal tool ☆780 · Updated 3 months ago
- Crawler for linguistic corpora ☆204 · Updated last year
- 🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box. ☆851 · Updated 9 months ago
- Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK. ☆1,068 · Updated 2 years ago
- Gather modern English word frequencies from all enwiki articles. ☆213 · Updated last year
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ☆744 · Updated 2 weeks ago
- Spellchecking library for Python ☆609 · Updated 11 months ago
- A simple library and set of tools for parsing, modifying, and composing SRT files. ☆512 · Updated last year
- Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way. ☆1,051 · Updated 2 months ago
- 📰 Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites. ☆784 · Updated 2 months ago
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆827 · Updated last month
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago
- Stand-alone language identification system ☆2,386 · Updated 5 years ago
- 🦆 Contextually-keyed word vectors ☆1,653 · Updated last month
- Pure Python spell-checker, (almost) full port of Hunspell ☆291 · Updated last year
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code ☆71 · Updated last year
- Fixes mojibake and other glitches in Unicode text, after the fact. ☆3,916 · Updated 7 months ago