5j9 / wikitextparser
A Python library to parse MediaWiki WikiText
☆301 Updated 4 months ago
Alternatives and similar repositories for wikitextparser:
Users who are interested in wikitextparser are comparing it to the libraries listed below.
- A Python parser for MediaWiki wikicode ☆782 Updated 2 months ago
- Python client library to interface with the MediaWiki API ☆325 Updated last month
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆98 Updated last week
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆27 Updated 5 years ago
- Python tools for interacting with Wikidata ☆152 Updated last year
- Wikidata client library for Python ☆348 Updated 8 months ago
- A Wikidata Python module integrating the MediaWiki API and the Wikidata SPARQL endpoint ☆254 Updated last year
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆150 Updated last year
- A modern, interlingual wordnet interface for Python ☆233 Updated last week
- ☆168 Updated 9 months ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆139 Updated 3 months ago
- A Python Wiktionary Parser ☆357 Updated 2 weeks ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆153 Updated 3 months ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆241 Updated 2 years ago
- The Global WordNet Association Collaborative Inter-Lingual Index ☆41 Updated 4 months ago
- unihandecode is a transliteration library to convert all characters/words in Unicode into ASCII alphabet that aware with Language prefere… ☆69 Updated 2 years ago
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ☆314 Updated 2 weeks ago
- Offline bilingual dictionaries made using data from Wiktionary ☆52 Updated 9 years ago
- A Python library for working with and comparing language codes. ☆344 Updated 3 months ago
- ConllEditor is a tool to edit dependency syntax trees in CoNLL-U format. ☆56 Updated 2 weeks ago
- This packages up data for the Open Multilingual Wordnet ☆46 Updated this week
- Compute PageRank on >3 billion Wikipedia links on off-the-shelf hardware. ☆58 Updated 4 months ago
- A Python library that interfaces with the MediaWiki API. This is a mirror from gerrit.wikimedia.org. Do not submit any patches here. See … ☆654 Updated this week
- The Open Multilingual Wordnet ☆61 Updated 10 months ago
- Various utilities for processing the data. ☆208 Updated this week
- Sentence aligner ☆110 Updated 3 years ago
- A set of utilities for processing MediaWiki XML dump data. ☆51 Updated last month
- Cython wrapper on Hunspell Dictionary ☆67 Updated 8 months ago
- Simple Python Wrapper around MediaWiki API ☆30 Updated 2 years ago
- Wiktionary dump file parser and multilingual data extractor ☆865 Updated this week