openzim / python-libzim
Libzim binding for Python: read/write ZIM files in Python
☆92 Updated 2 weeks ago
Alternatives and similar repositories for python-libzim
Users who are interested in python-libzim are comparing it to the libraries listed below.
- Create a ZIM file from a YouTube channel/username/playlist ☆78 Updated last week
- Farm operated by bots to grow and harvest new zim files ☆114 Updated this week
- Various ZIM command line tools ☆174 Updated 2 months ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆71 Updated 6 months ago
- Collection of Python code to re-use across Python-based scrapers ☆25 Updated 4 months ago
- A set of utilities for processing MediaWiki XML dump data. ☆57 Updated 7 months ago
- Kiwix & openZIM build engine ☆105 Updated this week
- An easy to use offline reader for ZIM files right in your browser! ☆79 Updated last year
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆50 Updated this week
- Atom, RSS and JSON feed parser for Python 3 ☆117 Updated 2 years ago
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆30 Updated 4 years ago
- Loadable spellfix1 extension for sqlite as python package ☆26 Updated last year
- fasttext with wheels and no external dependency, but only the predict method (<1MB) ☆17 Updated 9 months ago
- A polite and user-friendly downloader for Common Crawl data ☆57 Updated last month
- Standalone version of Django's feedgenerator module ☆54 Updated last month
- Python API for PDF documents ☆124 Updated last year
- MediaWiki scraper: all your wiki articles in one highly compressed ZIM file ☆393 Updated this week
- SQLite3 DB-API 2.0 driver from Python 3, packaged separately, with improvements ☆220 Updated 5 months ago
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆51 Updated last month
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. ☆221 Updated 2 weeks ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆107 Updated last week
- Training scripts for Argos Translate ☆140 Updated last week
- An experimental Python parser for MediaWiki syntax with a focus on extensibility and comprehensibility ☆61 Updated 3 years ago
- A Python library to parse MediaWiki WikiText ☆313 Updated 4 months ago
- Python wrapper for the MediaWiki API to access and parse data from Wikipedia ☆42 Updated 3 weeks ago
- Fast PDF generation and compression. Deals with millions of pages daily. ☆124 Updated last week
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆53 Updated 4 years ago
- Sort-friendly URI Reordering Transform (SURT) python module ☆43 Updated last week
- Scraper for downloading the entire ebooks repository of Project Gutenberg ☆152 Updated last month
- ZODB Client-Server framework ☆45 Updated last month