openzim / python-libzim
Libzim binding for Python: read/write ZIM files in Python
☆85 Updated 3 weeks ago
Alternatives and similar repositories for python-libzim
Users that are interested in python-libzim are comparing it to the libraries listed below
- Create a ZIM file from a YouTube channel/username/playlist ☆66 Updated 3 weeks ago
- Loadable spellfix1 extension for sqlite as python package ☆26 Updated last year
- Farm operated by bots to grow and harvest new zim files ☆104 Updated 3 weeks ago
- A set of utilities for processing MediaWiki XML dump data. ☆53 Updated 3 months ago
- MediaWiki scraper: all your wiki articles in one highly compressed ZIM file ☆354 Updated this week
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆40 Updated last week
- typed python RSS parsing module built using xmltodict and pydantic ☆45 Updated 7 months ago
- A polite and user-friendly downloader for Common Crawl data ☆43 Updated last week
- A Python binding of SQLite Full Text Search Tokenizer ☆48 Updated 2 weeks ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+ ☆13 Updated last year
- Atom, RSS and JSON feed parser for Python 3 ☆117 Updated 2 years ago
- An easy to use offline reader for ZIM files right in your browser! ☆80 Updated last year
- Python wrapper for the MediaWiki API to access and parse data from Wikipedia ☆39 Updated last month
- A modern CSS selector implementation for BeautifulSoup ☆236 Updated last week
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. ☆203 Updated last month
- CSV on the web ☆40 Updated 2 months ago
- An experimental Python parser for MediaWiki syntax with a focus on extensibility and comprehensibility ☆61 Updated 2 years ago
- A Memento Client Library in Python ☆26 Updated 7 years ago
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆27 Updated 4 years ago
- Simple bencode parser (for Python 2, Python 3 and PyPy) ☆54 Updated last year
- Standalone version of Django's feedgenerator module ☆52 Updated last year
- A Python API to the Internet Archive Wayback Machine ☆71 Updated 9 months ago
- Sort-friendly URI Reordering Transform (SURT) python module ☆42 Updated 9 months ago
- ☆43 Updated last year
- Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki ☆26 Updated 9 months ago
- Diff Match Patch is a high-performance library in multiple languages that manipulates plain text. ☆54 Updated 4 months ago
- A Python API for the GNU Privacy Guard (GnuPG). Encrypt, decrypt, sign and verify your data using Python! N.B. This repository has been m… ☆122 Updated 2 months ago
- A sentence segmentation library with wide language support optimized for speed and utility. ☆63 Updated 8 months ago
- Updates Wikidata entries using metadata from GitHub ☆44 Updated last month
- Sickle: OAI-PMH for Humans ☆113 Updated last year