earwig / mwparserfromhell
A Python parser for MediaWiki wikicode
☆779 · Updated last month
Alternatives and similar repositories for mwparserfromhell
Users interested in mwparserfromhell are comparing it to the libraries listed below:
- A Python library to parse MediaWiki WikiText ☆299 · Updated 4 months ago
- Python client library to interface with the MediaWiki API ☆325 · Updated 3 weeks ago
- A Python library that interfaces with the MediaWiki API. This is a mirror from gerrit.wikimedia.org. Do not submit any patches here. See … ☆651 · Updated this week
- Wikidata client library for Python ☆345 · Updated 7 months ago
- MediaWiki API wrapper in Python http://pymediawiki.readthedocs.io/en/latest/ ☆181 · Updated last month
- A Wikidata Python module integrating the MediaWiki API and the Wikidata SPARQL endpoint ☆250 · Updated last year
- Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis ☆576 · Updated last year
- Streaming WARC/ARC library for fast web archive IO ☆397 · Updated 2 months ago
- This is a mirror of the script by Giuseppe Attardi, and contains history before the official repo started: https://github.com/attardi/wik… ☆259 · Updated 8 years ago
- A modern, interlingual wordnet interface for Python ☆232 · Updated 2 weeks ago
- Heuristic-based boilerplate removal tool ☆747 · Updated 9 months ago
- A machine learning tool for fishing entities ☆258 · Updated this week
- Read and edit a Wikibase instance from the command line ☆230 · Updated this week
- ☆168 · Updated 8 months ago
- Fast multi-keyword search engine for text strings ☆252 · Updated 5 months ago
- A tool for learning vector representations of words and entities from Wikipedia ☆948 · Updated 9 months ago
- A Python module for English lemmatization and inflection. ☆265 · Updated last year
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆253 · Updated 5 months ago
- Entity linking system for Wikidata updated by your edits in real time ☆250 · Updated 2 months ago
- English word segmentation, written in pure Python, and based on a trillion-word corpus. ☆371 · Updated 2 years ago
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆97 · Updated 4 months ago
- Python library for reading and writing warc files ☆239 · Updated 2 years ago
- Text to sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆238 · Updated 2 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆138 · Updated 2 months ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆97 · Updated this week
- Python module (C extension and plain Python) implementing the Aho-Corasick algorithm ☆976 · Updated 11 months ago
- Lexical database of any language ☆176 · Updated 2 years ago
- Python wrapper for Wikipedia ☆631 · Updated this week
- (Official repo for pypi package) Python bindings for the Hunspell spellchecker engine ☆186 · Updated 4 years ago
- Hy-phen-ation made easy ☆207 · Updated this week