ArchiveTeam / grab-site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
☆1,505 Updated 2 months ago
Alternatives and similar repositories for grab-site
Users who are interested in grab-site are comparing it to the repositories listed below.
- Wget-compatible web downloader and crawler. ☆589 Updated last year
- Collect and revisit web pages. ☆1,512 Updated 7 months ago
- Core Python Web Archiving Toolkit for replay and recording of web archives ☆1,545 Updated this week
- A Python and Command-Line Interface to Archive.org ☆1,757 Updated this week
- Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder) ☆448 Updated 4 years ago
- brozzler - distributed browser-based web crawler ☆734 Updated last week
- Use yt-dlp to download video/metadata and upload to the Internet Archive. ☆458 Updated last month
- WARC writing MITM HTTP/S proxy ☆417 Updated this week
- Serverless replay of web archives directly in the browser ☆827 Updated 3 weeks ago
- ArchiveBot, an IRC bot for archiving websites ☆395 Updated 2 weeks ago
- List of data-hoarding related tools ☆1,205 Updated last year
- The OpenWayback Development ☆502 Updated last year
- Starting point for archiving entire YouTube channels using yt-dlp (originally youtube-dl) ☆505 Updated 2 years ago
- Chrome extension to "Create WARC files from any webpage" ☆222 Updated last year
- Self-Hosted Bookmark And Archive Manager ☆1,827 Updated last year
- Lightning-fast file system indexer and search tool ☆1,118 Updated last month
- Indexes open directories ☆1,249 Updated last month
- A curated list of awesome tools for website diffing and change monitoring. ☆512 Updated 3 years ago
- A Dockerfile for the ArchiveTeam Warrior ☆400 Updated 9 months ago
- find duplicate files utility ☆1,133 Updated 5 months ago
- IA's public Wayback Machine (moved from SourceForge) ☆801 Updated last year
- 💾 dn - offline full-text search and archiving for your Chromium-based browser. ☆3,849 Updated 3 months ago
- Webrecorder Desktop App! ☆205 Updated 4 years ago
- Scrapes Reddit to download media of your choice. ☆1,132 Updated last year
- Extremely fast tool to remove duplicates and other lint from your filesystem ☆2,141 Updated 3 months ago
- The ultimate collection of scripts for YouTube-DL. ☆2,483 Updated 2 months ago
- CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile) ☆934 Updated 2 months ago
- Download an entire website from the Wayback Machine. ☆5,660 Updated last year
- Wayback Machine API interface & a command-line tool ☆544 Updated last year
- archive reddit data as offline friendly web pages ☆175 Updated 5 years ago