tatuylonen / wikitextprocessor
Python package for Wikimedia dump processing (Wiktionary, Wikipedia, etc.). Provides wikitext parsing, template expansion, and Lua module execution, for data extraction, bulk syntax checking, error detection, and offline formatting.
☆107 · Updated last month
Alternatives and similar repositories for wikitextprocessor
Users who are interested in wikitextprocessor are comparing it to the libraries listed below.
- A modern, interlingual wordnet interface for Python ☆276 · Updated 3 weeks ago
- A Python Wiktionary Parser ☆368 · Updated 5 months ago
- A Python library to parse MediaWiki WikiText ☆316 · Updated 7 months ago
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆33 · Updated 6 years ago
- This packages up data for the Open Multilingual Wordnet ☆59 · Updated 6 months ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆180 · Updated 6 months ago
- The Open Multilingual Wordnet ☆66 · Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆52 · Updated 2 years ago
- Wiktionary parser tool for many language editions. ☆54 · Updated 3 years ago
- Machine-readable Wiktionary ☆77 · Updated last year
- A list of vocabulary lists ☆22 · Updated 5 years ago
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆162 · Updated last year
- Sentence aligner ☆122 · Updated 4 years ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆78 · Updated 3 weeks ago
- A multilingual parallel corpus created from translations of the Bible. ☆191 · Updated 7 months ago
- Wiktionary dump file parser and multilingual data extractor ☆1,058 · Updated this week
- A tokenizer and sentence splitter for German and English web and social media texts. ☆150 · Updated last year
- The Global WordNet Association Collaborative Inter-Lingual Index ☆50 · Updated last year
- The Open English WordNet ☆686 · Updated this week
- A python module for English lemmatization and inflection. ☆274 · Updated 2 years ago
- ☆80 · Updated 3 weeks ago
- The World Atlas of Language Structures ☆72 · Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆256 · Updated 3 years ago
- Bitextor generates translation memories from multilingual websites ☆299 · Updated last year
- Offline bilingual dictionaries made using data from Wiktionary ☆62 · Updated 10 years ago
- Aksharamukha Python Library ☆55 · Updated 10 months ago
- Gather modern English word frequencies from all enwiki articles. ☆227 · Updated last year
- Python Finite-State Toolkit ☆60 · Updated this week
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆205 · Updated last year
- A cloud-based, open-source system for writing and publishing dictionaries. ☆98 · Updated last year