nla / httrack2warc
Converts HTTrack crawls to WARC files
☆32 · Updated 9 months ago
Alternatives and similar repositories for httrack2warc:
Users who are interested in httrack2warc are comparing it to the repositories listed below.
- Official Python package for ArchiveBox, the self-hosted internet archiving solution. ☆13 · Updated 7 months ago
- Home of the official apt/deb package for Ubuntu/Debian-based systems. ☆17 · Updated 7 months ago
- A server to collect & archive websites that also supports video downloads ☆86 · Updated 2 years ago
- ArchiveBoxMatic: configure ArchiveBox with the simplicity of a yaml file. ☆14 · Updated 4 years ago
- ☆11 · Updated 3 years ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆57 · Updated last month
- PRE & NFO database and notification service for warez scene releases. This repository contains the frontend code written in Next.js and C… ☆32 · Updated 3 years ago
- Recover lost websites from the Web Infrastructure ☆89 · Updated 4 years ago
- simple script to convert web resources to a single warc file ☆21 · Updated last year
- 🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. ☆54 · Updated 8 months ago
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆54 · Updated 2 months ago
- Fake Seeder for Torrent ☆11 · Updated 5 years ago
- The ArchiveWeb.page Site ☆31 · Updated 5 months ago
- 🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. ☆15 · Updated 4 years ago
- Homebrew formula for the ArchiveBox self-hosted internet archiving solution. ☆28 · Updated 7 months ago
- Archiving public telegram messages. ☆12 · Updated 4 months ago
- a gui for TRID ( http://mark0.net/soft-trid-e.html ) ☆18 · Updated 8 years ago
- Copy all Google Fonts to a folder ☆10 · Updated 6 years ago
- Userscript to strip click tracking junk from Google search results URLs ☆15 · Updated 5 years ago
- Browser userscript to clean up hyperlink redirections and link shims ☆19 · Updated 3 years ago
- Deterministic Usenet Vault ☆32 · Updated 5 years ago
- simple Perl script for uploading files to Internet Archive through its S3-like interface ☆27 · Updated 7 years ago
- DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by Arch… ☆19 · Updated last year
- Strip advertisements from downloaded YouTube videos ☆59 · Updated 3 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆40 · Updated last week
- Discord archive tool ☆14 · Updated 2 years ago
- Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing… ☆77 · Updated last month
- The (new) discovery backend for https://odcrawler.xyz ☆29 · Updated 2 years ago
- plugin manager for yt-dlp which enables releases of extractors as separate python package ☆50 · Updated last week
- A browser extension to search for magnet links from The Pirate Bay directly from a popup toolbar ☆19 · Updated 6 months ago