siznax / wptools
Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
☆586 · Updated 2 years ago
Alternatives and similar repositories for wptools
Users who are interested in wptools are comparing it to the libraries listed below.
- Wikidata client library for Python ☆360 · Updated last month
- A Python function to break down hashtags or compound words created by putting together multiple words ☆33 · Updated 10 years ago
- Python tools for interacting with Wikidata ☆156 · Updated 2 years ago
- A simple interface to the Project Gutenberg corpus. ☆330 · Updated 2 years ago
- Python wrapper for Wikipedia ☆702 · Updated last week
- MediaWiki API wrapper in python http://pymediawiki.readthedocs.io/en/latest/ ☆186 · Updated last month
- Fact Extraction from Wikipedia Text ☆538 · Updated 9 years ago
- Textpipe: clean and extract metadata from text ☆302 · Updated 4 years ago
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆104 · Updated last month
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆260 · Updated 2 months ago
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram… ☆254 · Updated 5 years ago
- Cleans Reddit Text Data ☆84 · Updated 5 years ago
- A Python library to parse MediaWiki WikiText ☆315 · Updated 5 months ago
- Entity linking system for Wikidata updated by your edits in real time ☆256 · Updated 11 months ago
- A Python parser for MediaWiki wikicode ☆838 · Updated 4 months ago
- Default English stopword lists from many different sources ☆309 · Updated 2 years ago
- A python module for English lemmatization and inflection. ☆272 · Updated 2 years ago
- read and edit a Wikibase instance from the command line ☆237 · Updated last week
- DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text. Improving Efficiency and Accuracy in Mult… ☆182 · Updated 2 years ago
- Python wrapper for Stanford CoreNLP's SUTime ☆158 · Updated 2 years ago
- Quickly extract multi-word phrases from a corpus ☆194 · Updated 5 years ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆76 · Updated 5 months ago
- analyze text with empath ☆337 · Updated 8 years ago
- Guidelines. ☆100 · Updated last year
- Accurately generate all possible forms of an English word e.g "election" --> "elect", "electoral", "electorate" etc. ☆632 · Updated 4 years ago
- Tools for parsing and querying Wikimedia Foundation pageview data from both static dumps and the online API. ☆66 · Updated 3 years ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆527 · Updated last year
- Heuristic based boilerplate removal tool ☆801 · Updated 8 months ago
- AmbiverseNLU: A Natural Language Understanding suite by Max Planck Institute for Informatics ☆212 · Updated last year
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆170 · Updated 3 years ago