lehinevych / MediaWikiAPI
Python wrapper for the MediaWiki API to access and parse data from Wikipedia
☆40 · Updated 2 months ago
Alternatives and similar repositories for MediaWikiAPI
Users who are interested in MediaWikiAPI compare it to the libraries listed below.
- 📜 Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF ☆39 · Updated 3 years ago
- MediaWiki API wrapper in Python http://pymediawiki.readthedocs.io/en/latest/ ☆184 · Updated 4 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 5 years ago
- Alternative robots parser module for Python ☆18 · Updated 3 months ago
- Atom, RSS and JSON feed parser for Python 3 ☆117 · Updated 2 years ago
- Custom Python functions for working with SQLite FTS4 ☆22 · Updated 2 years ago
- Language detection using spaCy and fastText ☆55 · Updated last year
- Sort-friendly URI Reordering Transform (SURT) Python module ☆42 · Updated 10 months ago
- 🌸 Train floret vectors ☆18 · Updated 2 years ago
- Python API for PDF documents ☆122 · Updated 9 months ago
- Python package for converting XML and EPUB files to text files ☆34 · Updated 4 years ago
- Libzim binding for Python: read/write ZIM files in Python ☆87 · Updated last month
- Commons of stupid, simple Python micro functions. Pull requests very welcome. ☆19 · Updated last month
- Parse numbers written in natural language ☆116 · Updated 7 months ago
- Generate reports for spaCy models. ☆29 · Updated 3 years ago
- Extract knowledge from raw text ☆13 · Updated 3 years ago
- An experimental implementation of Burrows' Delta in Python 3 ☆21 · Updated 3 years ago
- 🕊️ Radically lightweight command-line interfaces ☆106 · Updated 2 years ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex. ☆62 · Updated this week
- 📂 Additional lookup tables and data resources for spaCy ☆105 · Updated this week
- Add website scraping abilities to Datasette ☆62 · Updated 2 years ago
- A Python implementation of Lunr.js 🌖 ☆195 · Updated 2 months ago
- A helper library full of URL-related heuristics. ☆69 · Updated 2 months ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆151 · Updated 4 months ago
- (Deprecated - please use https://github.com/gmarmstrong/python-datamuse) Python wrapper for the Datamuse API ☆15 · Updated 7 years ago
- Extract text from HTML ☆135 · Updated 4 years ago
- Template repository for Python projects ☆34 · Updated last month
- 📑 Python Package to reconstruct the original continuous text from PDFs with language models ☆32 · Updated last year
- A Python API to the Internet Archive Wayback Machine ☆73 · Updated 9 months ago
- spaCy extension for Visual Studio Code ☆32 · Updated 2 months ago