siznax / wptools
Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
☆576 Updated last year
Alternatives and similar repositories for wptools:
Users who are interested in wptools are comparing it to the libraries listed below:
- Wikidata client library for Python ☆346 Updated 7 months ago
- Python wrapper for Wikipedia ☆633 Updated last week
- Python client library to interface with the MediaWiki API ☆325 Updated last month
- MediaWiki API wrapper in python http://pymediawiki.readthedocs.io/en/latest/ ☆181 Updated last month
- A Pythonic wrapper for the Wikipedia API ☆2,924 Updated 9 months ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆371 Updated 2 years ago
- read and edit a Wikibase instance from the command line ☆231 Updated this week
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆253 Updated 6 months ago
- A Python parser for MediaWiki wikicode ☆779 Updated last month
- Python tools for interacting with Wikidata ☆151 Updated last year
- Accurately generate all possible forms of an English word e.g "election" --> "elect", "electoral", "electorate" etc. ☆629 Updated 3 years ago
- Fact Extraction from Wikipedia Text ☆531 Updated 8 years ago
- Heuristic based boilerplate removal tool ☆753 Updated this week
- python package to calculate readability statistics of a text object - paragraphs, sentences, articles. ☆1,187 Updated 3 weeks ago
- Entity linking system for Wikidata updated by your edits in real time ☆252 Updated 3 months ago
- A Python function to break down hashtags or compound words created by putting together multiple words ☆33 Updated 9 years ago
- ☆168 Updated 8 months ago
- A Python library that interfaces with the MediaWiki API. This is a mirror from gerrit.wikimedia.org. Do not submit any patches here. See … ☆652 Updated this week
- Streaming WARC/ARC library for fast web archive IO ☆401 Updated 2 months ago
- Accurately find/replace/remove emojis in text strings ☆160 Updated last year
- Language independent truecaser in Python. ☆160 Updated 3 years ago
- a collection of functions that measure the readability of a given body of text ☆191 Updated 7 years ago
- 💙 Emoji handling and meta data for spaCy with custom extension attributes ☆181 Updated last year
- Tools for parsing and querying Wikimedia Foundation pageview data from both static dumps and the online API. ☆65 Updated 3 years ago
- Python wrapper for Stanford CoreNLP ☆354 Updated 4 years ago
- analyze text with empath ☆322 Updated 7 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆150 Updated last year
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆167 Updated 2 months ago
- Elegant and Easy Tweet Preprocessing in Python ☆306 Updated last year
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆97 Updated 4 months ago