A Python library to parse MediaWiki WikiText
☆320 · May 15, 2025 · Updated 10 months ago
Alternatives and similar repositories for wikitextparser
Users that are interested in wikitextparser are comparing it to the libraries listed below.
- A Python parser for MediaWiki wikicode ☆866 · Mar 16, 2026 · Updated last week
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆108 · Mar 9, 2026 · Updated 2 weeks ago
- A MediaWiki-to-HTML parser for Python. Improved for Kitsune. ☆11 · Jan 26, 2023 · Updated 3 years ago
- Python client library to interface with the MediaWiki API ☆341 · Mar 9, 2026 · Updated 2 weeks ago
- ☆13 · Aug 20, 2021 · Updated 4 years ago
- A Python library that interfaces with the MediaWiki API. This is a mirror from gerrit.wikimedia.org. Do not submit any patches here. See … ☆742 · Updated this week
- Wiktionary dump file parser and multilingual data extractor ☆1,122 · Mar 16, 2026 · Updated last week
- A tool for extracting plain text from Wikipedia dumps ☆3,971 · May 23, 2024 · Updated last year
- Code for "Boosted Generative Models", AAAI 2018. ☆20 · Dec 26, 2017 · Updated 8 years ago
- MediaWiki for Lightroom ☆13 · Jan 8, 2022 · Updated 4 years ago
- Extraction code used to create the Dresden Web Table Corpus ☆14 · Feb 25, 2015 · Updated 11 years ago
- A Utility Library for Wikipedia dumps ☆33 · Feb 24, 2017 · Updated 9 years ago
- Runs some basic tests on your custom admin objects. ☆13 · Jun 19, 2024 · Updated last year
- Tools to process OpenAlex raw snapshot files ☆12 · Jan 17, 2025 · Updated last year
- Komoran for Python ☆15 · Dec 26, 2014 · Updated 11 years ago
- ☆18 · Jun 12, 2023 · Updated 2 years ago
- A collection of open source tools and resources related to Wikibase knowledge graphs ☆74 · Sep 9, 2025 · Updated 6 months ago
- Wikidata embedding ☆51 · Nov 5, 2024 · Updated last year
- Mirror of https://gerrit.wikimedia.org/g/mediawiki/gadgets/RTRC. ☆27 · Mar 5, 2026 · Updated 2 weeks ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago
- TokenQuery (regular expressions over tokens) ☆28 · Mar 1, 2017 · Updated 9 years ago
- A Python Wiktionary Parser ☆370 · Jul 23, 2025 · Updated 8 months ago
- cookiecutter template for Wikimedia Toolforge tools using Flask ☆25 · Nov 19, 2025 · Updated 4 months ago
- Linked Clinical Trials (LinkedCT) ☆11 · Oct 13, 2015 · Updated 10 years ago
- CircleHash is a family of fast hashes -- CircleHash64f is ideal for short inputs, reaching 10GB/s starting at <64 bytes and 15GB/s at 256… ☆22 · Updated this week
- ☆19 · Dec 19, 2018 · Updated 7 years ago
- A python wrapper for Semaphore, a Shallow Semantic Parser that identifies roles in a text. ☆12 · Jul 2, 2013 · Updated 12 years ago
- Creates dictionary files from Wiktionary data ☆29 · Aug 21, 2025 · Updated 7 months ago
- Python 3 library for reading and writing warc files ☆21 · Jan 29, 2018 · Updated 8 years ago
- Ontology alignment between Schema.Org, Wikidata, and DBpedia ☆11 · Oct 25, 2017 · Updated 8 years ago
- Highly concurrent and fast content processing for Mighty Inference Server ☆10 · Feb 6, 2023 · Updated 3 years ago
- Unicode-only CJKV IDS data ☆13 · Aug 9, 2024 · Updated last year
- Tool to parse wiki tables from the HTML dump of Wikipedia ☆11 · Jun 12, 2022 · Updated 3 years ago
- Json Wikipedia, contains code to convert the Wikipedia xml dump into a json dump. Questions? https://gitter.im/idio-opensource/Lobby ☆17 · May 20, 2022 · Updated 3 years ago
- R package to provide data access to OpenAlex by way of REST API ☆12 · Jan 12, 2026 · Updated 2 months ago
- Exploring Few-Shot Adaptation of Language Models with Tables ☆24 · Aug 22, 2022 · Updated 3 years ago
- A set of utilities for processing MediaWiki XML dump data. ☆62 · Mar 16, 2026 · Updated last week
- A Node.js/browser parser for MediaWiki markup with AST ☆43 · Updated this week
- pytorch model for cross-lingual entity linking. ☆16 · Mar 13, 2019 · Updated 7 years ago