ArchiveTeam / grab-site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
☆1,506, updated 2 months ago
Alternatives and similar repositories for grab-site
Users interested in grab-site are comparing it to the repositories listed below.
- Wget-compatible web downloader and crawler. (☆587, updated last year)
- Collect and revisit web pages. (☆1,508, updated 6 months ago)
- An Awesome List for getting started with web archiving (☆2,321, updated 3 months ago)
- A Python and Command-Line Interface to Archive.org (☆1,741, updated 2 weeks ago)
- Serverless replay of web archives directly in the browser (☆816, updated last week)
- Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to tiniest wikis. As of 2025, WikiTeam has preserved more th… (☆777, updated 4 months ago)
- WARC writing MITM HTTP/S proxy (☆417, updated last week)
- Use yt-dlp to download video/metadata and upload to the Internet Archive. (☆458, updated 2 weeks ago)
- List of data-hoarding related tools (☆1,198, updated last year)
- A Tool To Push Web Resources Into Web Archives (☆420, updated last year)
- Chrome extension to "Create WARC files from any webpage" (☆222, updated last year)
- Starting point for archiving entire YouTube channels using yt-dlp (originally youtube-dl) (☆501, updated 2 years ago)
- Self-Hosted Bookmark And Archive Manager (☆1,826, updated last year)
- A curated list of awesome tools for website diffing and change monitoring. (☆512, updated 2 years ago)
- Utilities for dealing with Tumblr blogs, Tumblr backup (☆686, updated 5 months ago)
- Indexes open directories (☆1,240, updated last month)
- Offline Internet Archive project (☆289, updated last year)
- Webrecorder Desktop App! (☆205, updated 4 years ago)
- Shell scripts for organizing and managing ebook collections (☆749, updated 5 years ago)
- Extremely fast tool to remove duplicates and other lint from your filesystem (☆2,131, updated 2 months ago)
- archive reddit data as offline friendly web pages (☆175, updated 5 years ago)
- Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matc… (☆811, updated 5 months ago)
- The ultimate collection of scripts for YouTube-DL. (☆2,466, updated last month)
- Web Extension to save a faithful copy of an entire web page in a self-extracting ZIP file (☆1,892, updated last week)
- 💾 dn - offline full-text search and archiving for your Chromium-based browser. (☆3,850, updated 2 months ago)
- The personal, minimalist, super-fast, database free, bookmarking service - community repo (☆3,682, updated last month)
- Scrapes Reddit to download media of your choice. (☆1,131, updated last year)
- A very low memory-footprint, self hosted API-only torrent search engine. Sonarr + Radarr Compatible, native support for Linux, Mac and Wi… (☆801, updated last year)
- CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile) (☆912, updated 2 months ago)
- a standard filetree for /r/datacurator [ and r/datahoarder ] (☆1,573, updated 11 months ago)