5j9 / wikitextparser
A Python library to parse MediaWiki WikiText
☆307, updated 6 months ago
Alternatives and similar repositories for wikitextparser:
Users who are interested in wikitextparser are comparing it to the libraries listed below.
- A Python parser for MediaWiki wikicode ☆790, updated last month
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆99, updated this week
- Wikidata client library for Python ☆355, updated 9 months ago
- A Python Wiktionary Parser ☆358, updated 2 months ago
- Various utilities for processing the data. ☆209, updated this week
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆97, updated 6 months ago
- Python client library to interface with the MediaWiki API ☆326, updated last month
- ☆169, updated last month
- WordNet in JSON format. ☆92, updated 4 years ago
- A set of utilities for processing MediaWiki XML dump data. ☆53, updated 2 months ago
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆28, updated 5 years ago
- A modern, interlingual wordnet interface for Python ☆244, updated this week
- A Python library that interfaces with the MediaWiki API. This is a mirror from gerrit.wikimedia.org. Do not submit any patches here. See … ☆669, updated last week
- A Wikidata Python module integrating the MediaWiki API and the Wikidata SPARQL endpoint ☆255, updated last year
- Streaming WARC/ARC library for fast web archive IO ☆411, updated 4 months ago
- CLI for loading Wikidata subsets (or all of it) into Elasticsearch ☆70, updated 3 years ago
- Cython wrapper on Hunspell Dictionary ☆66, updated 10 months ago
- MediaWiki API wrapper in Python http://pymediawiki.readthedocs.io/en/latest/ ☆183, updated 3 months ago
- The Global WordNet Association Collaborative Inter-Lingual Index ☆42, updated 5 months ago
- Universal Dependencies online documentation ☆283, updated this week
- Wiktionary parser tool for many language editions. ☆54, updated 2 years ago
- Python tools for interacting with Wikidata ☆153, updated last year
- ConllEditor is a tool to edit dependency syntax trees in CoNLL-U format. ☆56, updated last week
- Python Finite-State Toolkit ☆54, updated 2 months ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆154, updated 5 months ago
- German part-of-speech dictionary ☆45, updated last year
- Library for unit extraction - fork of quantulum for python3 ☆138, updated 10 months ago
- Wiktionary dump file parser and multilingual data extractor ☆900, updated this week
- A collection of functions that measure the readability of a given body of text ☆192, updated 7 years ago
- Tools for parsing and querying Wikimedia Foundation pageview data from both static dumps and the online API. ☆65, updated 3 years ago