mediawiki-utilities / python-mwxml
A set of utilities for processing MediaWiki XML dump data.
☆53 Updated 2 months ago
Alternatives and similar repositories for python-mwxml:
Users who are interested in python-mwxml are comparing it to the libraries listed below.
- Sort-friendly URI Reordering Transform (SURT) python module ☆42 Updated 8 months ago
- Citation Classification using hybrid neural network model for Wikipedia References ☆28 Updated 2 years ago
- GitHub mirror - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing) ☆36 Updated 10 months ago
- Python tools for interacting with Wikidata ☆153 Updated last year
- A tool to analyse, browse and query Wikidata ☆83 Updated 6 months ago
- An algorithm to compute token-level provenance and changes for Wiki revisioned content. Tested at +95% accuracy for EN.Wikipedia. ☆31 Updated 6 years ago
- A Knowledge Base for research software relying on large-scale text mining and curated knowledge sources ☆16 Updated last year
- Simple Python Wrapper around MediaWiki API ☆30 Updated 2 years ago
- The official repo for the QuickStatements PHP/HTML/JS interface ☆46 Updated 2 weeks ago
- Adding links to full text in Wikipedia references ☆37 Updated last year
- Wikidata service to help create or link author items to published articles ☆33 Updated 2 months ago
- https://en.wikipedia.org/wiki/User:SuperHamster/CiteUnseen ☆16 Updated last year
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆97 Updated 6 months ago
- A PDF classifier ensemble with REST API service ☆23 Updated 4 years ago
- Wikidata lexemes presentations ☆23 Updated 2 weeks ago
- A Python module to manipulate data on a Wikibase instance (like Wikidata) through the MediaWiki Wikibase API and the Wikibase SPARQL endp… ☆77 Updated 2 weeks ago
- Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki ☆26 Updated 8 months ago
- A deep learning model for extracting references from text ☆28 Updated last year
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 Updated last year
- an experimental implementation of Burrows' Delta in Python 3 ☆21 Updated 3 years ago
- 🌸 Train floret vectors ☆18 Updated last year
- Entity linking system for Wikidata updated by your edits in real time ☆254 Updated 4 months ago
- Parses Wikipedia citation templates in Python ☆16 Updated 3 weeks ago
- Tools for querying various name-based gender inference services and evaluating them. ☆10 Updated 2 years ago
- CLI for loading Wikidata subsets (or all of it) into Elasticsearch ☆70 Updated 3 years ago
- search interface for scholarly works ☆85 Updated 8 months ago
- Imports Wiktionary's grammatical data into Wikidata ☆17 Updated 5 years ago
- Updates Wikidata entries using metadata from github ☆44 Updated 3 weeks ago
- ☆38 Updated 6 years ago
- Tool for generating filtered Wikidata RDF exports ☆42 Updated 3 years ago