mediawiki-utilities / python-mwxml
A set of utilities for processing MediaWiki XML dump data.
☆60 Updated 9 months ago
Alternatives and similar repositories for python-mwxml
Users who are interested in python-mwxml compare it to the libraries listed below.
- search interface for scholarly works ☆85 Updated last year
- A Python library to parse MediaWiki WikiText ☆317 Updated 6 months ago
- Sort-friendly URI Reordering Transform (SURT) python module ☆44 Updated 3 months ago
- Python client library to interface with the MediaWiki API ☆338 Updated last week
- Command line interface to Wikidata Query Service ☆55 Updated last year
- Citation Classification using hybrid neural network model for Wikipedia References ☆31 Updated 3 years ago
- Github mirror - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing) ☆37 Updated last year
- Wikidata lexemes presentations ☆23 Updated 8 months ago
- Python bot framework for Lexemes on Wikidata ☆19 Updated 4 years ago
- ☆40 Updated 7 years ago
- read and edit a Wikibase instance from the command line ☆238 Updated this week
- Adding links to full text in Wikipedia references ☆37 Updated 5 months ago
- Perpetual Access To The Scholarly Record ☆120 Updated last year
- A Python module to manipulate data on a Wikibase instance (like Wikidata) through the MediaWiki Wikibase API and the Wikibase SPARQL endp… ☆83 Updated this week
- Imports Wiktionary's grammatical data into Wikidata ☆18 Updated 5 years ago
- Parses Wikipedia citation templates in Python ☆17 Updated 8 months ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 Updated 3 years ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ☆12 Updated last year
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆66 Updated this week
- A high performance bibliographic information service: https://biblio-glutton.readthedocs.io ☆146 Updated 5 months ago
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆104 Updated 3 months ago
- [OBSOLETE] Replaced by https://gitlab.wikimedia.org/toolforge-repos/python-toolforge ☆22 Updated 2 years ago
- A fun tool for quickly browsing unsourced snippets on Wikipedia. ☆112 Updated this week
- CLI for loading Wikidata subsets (or all of it) into Elasticsearch ☆71 Updated 3 years ago
- DEPRECATED REPO: SEE https://gitlab.wikimedia.org/kevinpayravi/cite-unseen ☆16 Updated 2 months ago
- A deep learning model for extracting references from text ☆30 Updated 2 years ago
- A Wikidata Python module integrating the MediaWiki API and the Wikidata SPARQL endpoint ☆259 Updated 2 years ago
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated 2 years ago
- Text-Induced Corpus Clean-up ☆20 Updated 2 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 Updated 2 years ago