bitdruid / python-wayback-machine-downloader
Query and download from archive.org as simply as possible.
☆98 Updated this week
Alternatives and similar repositories for python-wayback-machine-downloader
Users who are interested in python-wayback-machine-downloader are comparing it to the libraries listed below.
- Scrape posts, threads from forums, news aggregators, mail archives, export to JSONL, mailbox, WARC ☆105 Updated last year
- Wayback Machine Downloader. 🔥 Download your entire archived websites from the Internet Archive Wayback Machine. ☆101 Updated 3 years ago
- Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. ☆130 Updated 3 months ago
- Use yt-dlp to download video/metadata and upload to the Internet Archive. ☆468 Updated 2 weeks ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆75 Updated 8 months ago
- Bash scripts which interact with Internet Archive Wayback Machine's Save Page Now ☆133 Updated 7 months ago
- Tool and library for handling Web ARChive (WARC) files. ☆165 Updated last year
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆30 Updated 4 years ago
- Itch.io game downloader with website, game jam, collection and library support ☆143 Updated 7 months ago
- Simultaneous, resumable and hash-verified downloads from Internet Archive (archive.org) ☆170 Updated last year
- Download an entire website from the Wayback Machine. ☆254 Updated last year
- A collection of tools for archiving and analysing the internet. ☆78 Updated 3 years ago
- Discord archive tool ☆19 Updated 3 years ago
- Scripts to build and boot warrior virtual machine containing Docker ☆122 Updated 7 months ago
- Reddit archiver ☆181 Updated last year
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) ☆167 Updated 3 months ago
- Download an entire website from the Wayback Machine. ☆248 Updated 2 weeks ago
- ☆54 Updated last year
- go-ia is a command-line interface for interacting with archive.org written in Go. ☆62 Updated 4 years ago
- Awesome list dedicated to digital and data preservation tools, sources, services and so on. ☆28 Updated 3 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆52 Updated this week
- Download YouTube comments from numerous videos, playlists, and channels for archiving, general search, and showing activity. ☆297 Updated 4 months ago
- A Python Reddit API Wrapper (PRAW) script to download all of the accessible wiki pages of a Reddit subreddit ☆51 Updated last year
- Scrape Twitter API without authentication using Nitter. ☆65 Updated 3 years ago
- NOTE: This project is no longer being actively developed. Check out https://replayweb.page / https://github.com/webrecorder/replayweb.pa… ☆200 Updated 10 months ago
- automatic and extensive scraper for forums ☆36 Updated 2 months ago
- Wayback Machine API interface & a command-line tool ☆553 Updated last year
- Rust program for extracting most URLs from Discord scrapes. Works with Discord History Tracker, discard2, and DiscordChatExporter. ☆20 Updated 11 months ago
- Archived tweets from the Wayback Machine ☆151 Updated 6 months ago
- 🍨 High-fidelity, browser-based, single-page web archiving library and CLI for witnessing the web. ☆180 Updated 2 months ago