nla / httrack2warc
Converts HTTrack crawls to WARC files
☆33 · Updated last year
Alternatives and similar repositories for httrack2warc
Users interested in httrack2warc are comparing it to the repositories listed below.
- An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. · ☆59 · Updated last year
- ArchiveBoxMatic: configure ArchiveBox with the simplicity of a yaml file. · ☆14 · Updated 4 years ago
- Archiving public telegram messages. · ☆15 · Updated 2 months ago
- An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. · ☆15 · Updated 5 years ago
- Official Python package for ArchiveBox, the self-hosted internet archiving solution. · ☆13 · Updated last year
- ☆11 · Updated 3 years ago
- Recover lost websites from the Web Infrastructure · ☆89 · Updated 2 months ago
- A server to collect & archive websites that also supports video downloads · ☆86 · Updated 2 years ago
- [mirror] Backup a list of github starred repositories for the specified user. · ☆141 · Updated 2 years ago
- URLTeam's second generation of URL shortener archiving tools · ☆79 · Updated 2 months ago
- Grabbing everything from reddit. · ☆61 · Updated last year
- The (new) discovery backend for https://odcrawler.xyz · ☆36 · Updated 2 years ago
- Bash scripts which interact with Internet Archive Wayback Machine's Save Page Now · ☆133 · Updated 7 months ago
- Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. · ☆129 · Updated 2 months ago
- Scrape https://unlistedvideos.com/ · ☆15 · Updated 4 years ago
- a gui for TRID ( http://mark0.net/soft-trid-e.html ) · ☆21 · Updated 9 years ago
- A youtube-dl extension with pluggable extractors · ☆53 · Updated 6 months ago
- Archiving URLs (outlinks) from a variety of sources. · ☆24 · Updated last week
- wpull fork with fixes and faster parsing using html5-parser; used by grab-site; should go away when wpull is similarly improved · ☆30 · Updated last month
- Server and bookmarklet to download files via youtube-dl directly from your browser. Cross platform single binary installation, web browse… · ☆78 · Updated 5 months ago
- Fake Seeder for Torrent · ☆12 · Updated 5 years ago
- Homebrew formula for the ArchiveBox self-hosted internet archiving solution. · ☆28 · Updated last year
- Home of the official apt/deb package for Ubuntu/Debian-based systems. · ☆17 · Updated last year
- Generate Bookmarks export file (html) of the github user's starred repos · ☆71 · Updated 8 years ago
- A command line tool to archive a git repository from GitHub to the Internet Archive. · ☆91 · Updated 4 years ago
- Mozilla LZ4 File Decryption and Mining Tools · ☆38 · Updated 5 months ago
- END OF THE WORLD · ☆11 · Updated 5 years ago
- Strip advertisements from downloaded YouTube videos · ☆60 · Updated 4 years ago
- Reverse search an image on every search engine · ☆43 · Updated 5 years ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format · ☆72 · Updated 7 months ago