openzim / python-libzim
Libzim binding for Python: read/write ZIM files in Python
☆95 Updated this week
Alternatives and similar repositories for python-libzim
Users who are interested in python-libzim compare it to the libraries listed below.
- An easy-to-use offline reader for ZIM files right in your browser! ☆82 Updated last year
- Various ZIM command line tools ☆180 Updated last month
- A set of utilities for processing MediaWiki XML dump data. ☆60 Updated 10 months ago
- Translate HTML using Argos Translate ☆54 Updated 2 years ago
- Create a ZIM file from a YouTube channel/username/playlist ☆83 Updated 3 weeks ago
- Python wrapper for the MediaWiki API to access and parse data from Wikipedia ☆42 Updated last week
- Reference implementation of the ZIM specification ☆217 Updated this week
- Collection of Python code to re-use across Python-based scrapers ☆24 Updated last week
- Training scripts for Argos Translate ☆148 Updated 3 weeks ago
- fasttext with wheels and no external dependency, but only the predict method (<1MB) ☆18 Updated last year
- Atom, RSS and JSON feed parser for Python 3 ☆117 Updated 3 years ago
- Kiwix & openZIM build engine ☆109 Updated last week
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆75 Updated 8 months ago
- Search interface for scholarly works ☆85 Updated last year
- Loadable spellfix1 extension for SQLite as a Python package ☆26 Updated last year
- Farm operated by bots to grow and harvest new ZIM files ☆171 Updated this week
- Scraper for downloading the entire ebooks repository of Project Gutenberg ☆154 Updated this week
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆53 Updated 2 weeks ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆56 Updated 4 years ago
- modulegraph determines a dependency graph between Python modules primarily by bytecode analysis for import statements. modulegraph … ☆46 Updated 2 weeks ago
- Python library to validate, clean, transform and get metadata of ISBN strings (for devs). ☆268 Updated last year
- A Python library to parse MediaWiki WikiText ☆317 Updated 6 months ago
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. ☆226 Updated this week
- Python client library to interface with the MediaWiki API ☆338 Updated last week
- An experimental Python parser for MediaWiki syntax with a focus on extensibility and comprehensibility ☆60 Updated 3 years ago
- Python package for Wikimedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆108 Updated 2 weeks ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆142 Updated last month
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆30 Updated 4 years ago
- MediaWiki scraper: all your wiki articles in one highly compressed ZIM file ☆412 Updated last week
- Python API for PDF documents ☆125 Updated last year