siznax / wptools
Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
☆590 · Updated 2 years ago
Alternatives and similar repositories for wptools
Users who are interested in wptools compare it to the libraries listed below.
- Wikidata client library for Python ☆358 · Updated last year
- A simple interface to the Project Gutenberg corpus. ☆329 · Updated 2 years ago
- A Python parser for MediaWiki wikicode ☆822 · Updated last month
- Python tools for interacting with Wikidata ☆154 · Updated last year
- Outputs a list of ranked DBpedia resources for a search string. ☆187 · Updated 4 years ago
- Fact Extraction from Wikipedia Text ☆537 · Updated 9 years ago
- Python wrapper for Wikipedia ☆694 · Updated this week
- Entity linking system for Wikidata updated by your edits in real time ☆257 · Updated 8 months ago
- A Python library to parse MediaWiki WikiText ☆312 · Updated 3 months ago
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆99 · Updated 2 months ago
- ☆129 · Updated 3 years ago
- spacy-wordnet creates annotations that easily allow the use of WordNet and WordNet domains by using the NLTK WordNet interface ☆260 · Updated last week
- MediaWiki API wrapper in Python http://pymediawiki.readthedocs.io/en/latest/ ☆185 · Updated last week
- The software used to extract structured data from Wikipedia ☆902 · Updated 6 months ago
- Streaming WARC/ARC library for fast web archive IO ☆428 · Updated 8 months ago
- Read and edit a Wikibase instance from the command line ☆235 · Updated 3 months ago
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co… ☆316 · Updated 3 years ago
- Guidelines. ☆99 · Updated last year
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆525 · Updated 10 months ago
- GERBIL - General Entity annotatoR Benchmark ☆228 · Updated last week
- Heuristic based boilerplate removal tool ☆791 · Updated 6 months ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆376 · Updated 2 years ago
- DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text. Improving Efficiency and Accuracy in Mult… ☆182 · Updated 2 years ago
- A multilingual, cross-domain temporal tagger developed at the Database Systems Research Group at Heidelberg University. ☆356 · Updated 2 years ago
- A set of utility scripts to process Wikipedia related data ☆38 · Updated 3 years ago
- 💙 Emoji handling and meta data for spaCy with custom extension attributes ☆181 · Updated 2 years ago
- A Python function to break down hashtags or compound words created by putting together multiple words ☆34 · Updated 10 years ago
- Python client library to interface with the MediaWiki API ☆333 · Updated last week
- Textpipe: clean and extract metadata from text ☆302 · Updated 4 years ago
- Accurately generate all possible forms of an English word, e.g. "election" --> "elect", "electoral", "electorate" etc. ☆634 · Updated 4 years ago