5j9 / wikitextparser
A Python library to parse MediaWiki WikiText
☆301 Updated 5 months ago
Alternatives and similar repositories for wikitextparser:
Users who are interested in wikitextparser compare it to the libraries listed below.
- A Python parser for MediaWiki wikicode ☆783 Updated 2 months ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆98 Updated 2 weeks ago
- Python client library to interface with the MediaWiki API ☆325 Updated this week
- Wikidata client library for Python ☆349 Updated 8 months ago
- MediaWiki API wrapper in python http://pymediawiki.readthedocs.io/en/latest/ ☆182 Updated 2 months ago
- ☆168 Updated 9 months ago
- A python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- A Python library for working with and comparing language codes. ☆345 Updated 3 months ago
- Python tools for interacting with Wikidata ☆152 Updated last year
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆97 Updated 5 months ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆150 Updated last year
- A Wikidata Python module integrating the MediaWiki API and the Wikidata SPARQL endpoint ☆254 Updated last year
- A python module for English lemmatization and inflection. ☆265 Updated last year
- The Global WordNet Association Collaborative Inter-Lingual Index ☆41 Updated 4 months ago
- A set of utilities for processing MediaWiki XML dump data. ☆52 Updated last month
- A library for fetching and reading Tatoeba's weekly exports ☆22 Updated last year
- A Python Wiktionary Parser ☆358 Updated last month
- WordNet in JSON format. ☆90 Updated 4 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆139 Updated 3 months ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆64 Updated this week
- Machine-readable Wiktionary ☆76 Updated 10 months ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆169 Updated 3 years ago
- Python module that identifies Chinese text as being Simplified or Traditional ☆91 Updated 4 months ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated last year
- Simple Python Wrapper around MediaWiki API ☆30 Updated 2 years ago
- Entity linking system for Wikidata updated by your edits in real time ☆254 Updated 3 months ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆61 Updated 2 weeks ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆154 Updated 4 months ago
- A Python library that interfaces with the MediaWiki API. This is a mirror from gerrit.wikimedia.org. Do not submit any patches here. See … ☆657 Updated 2 weeks ago
- Offline bilingual dictionaries made using data from Wiktionary ☆52 Updated 9 years ago