siznax / wptools
Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
☆591 Updated 2 years ago
Alternatives and similar repositories for wptools
Users who are interested in wptools are comparing it to the libraries listed below
- Wikidata client library for Python ☆363 Updated 2 months ago
- Accurately generate all possible forms of an English word e.g. "election" --> "elect", "electoral", "electorate" etc. ☆631 Updated 4 years ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆261 Updated 5 months ago
- A simple interface to the Project Gutenberg corpus. ☆331 Updated 3 years ago
- Python tools for interacting with Wikidata ☆160 Updated 2 years ago
- ☆129 Updated 4 years ago
- Python wrapper for Wikipedia ☆711 Updated last month
- Entity linking system for Wikidata updated by your edits in real time ☆258 Updated last month
- A Python parser for MediaWiki wikicode ☆856 Updated 6 months ago
- MediaWiki API wrapper in python http://pymediawiki.readthedocs.io/en/latest/ ☆186 Updated last week
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆105 Updated 4 months ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆528 Updated last year
- Heuristic based boilerplate removal tool ☆810 Updated 10 months ago
- A machine learning tool for fishing entities ☆270 Updated 8 months ago
- Textpipe: clean and extract metadata from text ☆302 Updated 4 years ago
- Quickly extract multi-word phrases from a corpus ☆195 Updated 5 years ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆378 Updated 3 years ago
- Python wrapper for Stanford CoreNLP's SUTime ☆162 Updated 2 years ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆76 Updated 7 months ago
- Fact Extraction from Wikipedia Text ☆537 Updated 9 years ago
- Default English stopword lists from many different sources ☆311 Updated 2 years ago
- A Python library to parse MediaWiki WikiText ☆315 Updated 8 months ago
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram… ☆254 Updated 5 years ago
- Cleans Reddit Text Data ☆84 Updated 5 years ago
- All languages stopwords collection ☆476 Updated 2 years ago
- A python module for English lemmatization and inflection. ☆274 Updated 2 years ago
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co… ☆316 Updated 3 years ago
- A tool for learning vector representations of words and entities from Wikipedia ☆964 Updated last year
- Collection of tools for building diachronic/historical word vectors ☆443 Updated 2 years ago
- Language independent truecaser in Python. ☆159 Updated 4 years ago