openzim / zimfarm
Farm operated by bots to grow and harvest new zim files
☆118, updated this week
Alternatives and similar repositories for zimfarm
Users who are interested in zimfarm compare it to the repositories listed below.
- MediaWiki scraper: all your wiki articles in one highly compressed ZIM file (☆400, updated last week)
- Various ZIM command line tools (☆177, updated 2 weeks ago)
- Command line tool to convert a file in the WARC format to a file in the ZIM format (☆71, updated 7 months ago)
- Kiwix & openZIM build engine (☆107, updated last week)
- Create a ZIM file from a YouTube channel/username/playlist (☆79, updated last week)
- Nondestructive warc-in-tar to warc conversion (☆27, updated 12 years ago)
- creates ZIM files for Kiwix from arbitrary websites with wget and some nifty tricks (doesn't need ServiceWorkers) (☆98, updated 4 months ago)
- A list of things related to software, literature, and other content for 🕣 Memento (☆102, updated last year)
- We back up a lot of stuff from around the web; now it's time to back up the Internet Archive, just in case. (☆92, updated 5 years ago)
- Want a new ZIM file? Propose ZIM content improvements or fixes? Here you are! (☆61, updated 3 months ago)
- Reference implementation of the ZIM specification (☆210, updated 2 weeks ago)
- A command line tool to archive a git repository from GitHub to the Internet Archive. (☆91, updated 4 years ago)
- 🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. (☆59, updated last year)
- (☆89, updated 6 months ago)
- Collection of Python code to re-use across Python-based scrapers (☆24, updated 3 weeks ago)
- A Python CLI to manage torrents for Library Genesis data. (☆71, updated 9 months ago)
- Saves proxied HTTP traffic to a WARC file. (☆30, updated 12 years ago)
- [ARCHIVED] Kiwix Hotspot Image Creator (Desktop) for Windows/macOS/Linux (☆71, updated last year)
- Chrome extension to "Create WARC files from any webpage" (☆224, updated last year)
- Convert HTTP Archive (HAR) -> Web Archive (WARC) format (☆54, updated 7 years ago)
- Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. (☆130, updated 2 months ago)
- (☆17, updated this week)
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. (☆50, updated 2 weeks ago)
- Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more … (☆344, updated last week)
- A web archives reader (☆114, updated last month)
- Scrape posts, threads from forums, news aggregators, mail archives, export to JSONL, mailbox, WARC (☆101, updated last year)
- The OpenWayback Development (☆506, updated last year)
- A dockerized, queued high fidelity web archiver based on Squidwarc (☆61, updated last year)
- Archive.org OPDS Bookserver - A standard for digital book distribution (☆130, updated 6 years ago)
- Kiwix JS Offline Browser implemented as a Progressive Web App (PWA), and packaged as Electron, NWJS and UWP apps for Windows, Linux and m… (☆224, updated this week)