nla / httrack2warc
Converts HTTrack crawls to WARC files
☆33 Updated last year
Alternatives and similar repositories for httrack2warc
Users who are interested in httrack2warc compare it to the repositories listed below.
- ArchiveBoxMatic: configure ArchiveBox with the simplicity of a yaml file. ☆14 Updated 4 years ago
- Official Python package for ArchiveBox, the self-hosted internet archiving solution. ☆13 Updated 11 months ago
- Archiving public telegram messages. ☆14 Updated 3 weeks ago
- 🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. ☆58 Updated last year
- ☆11 Updated 3 years ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆71 Updated 6 months ago
- 🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. ☆15 Updated 4 years ago
- A list of things related to software, literature, and other content for 🕣 Memento ☆99 Updated last year
- A server to collect & archive websites that also supports video downloads ☆86 Updated 2 years ago
- Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. ☆130 Updated last month
- Recover lost websites from the Web Infrastructure ☆89 Updated last month
- Homebrew formula for the ArchiveBox self-hosted internet archiving solution. ☆29 Updated 11 months ago
- wpull fork with fixes and faster parsing using html5-parser; used by grab-site; should go away when wpull is similarly improved ☆30 Updated this week
- DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by Arch… ☆19 Updated last year
- Home of the official apt/deb package for Ubuntu/Debian-based systems. ☆17 Updated 11 months ago
- END OF THE WORLD ☆11 Updated 5 years ago
- [mirror] Backup a list of github starred repositories for the specified user. ☆142 Updated 2 years ago
- Mozilla LZ4 File Decryption and Mining Tools ☆37 Updated 4 months ago
- Server and bookmarklet to download files via youtube-dl directly from your browser. Cross platform single binary installation, web browse… ☆78 Updated 4 months ago
- A youtube-dl extension with pluggable extractors ☆52 Updated 5 months ago
- The ArchiveWeb.page Site ☆31 Updated 9 months ago
- Grabbing everything from reddit. ☆62 Updated last year
- URLTeam's second generation of URL shortener archiving tools ☆80 Updated 3 weeks ago
- Scrape https://unlistedvideos.com/ ☆15 Updated 4 years ago
- Bash scripts which interact with Internet Archive Wayback Machine's Save Page Now ☆133 Updated 5 months ago
- Javascript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a oneshot CLI command to extract each page… ☆40 Updated last year
- Archiving URLs (outlinks) from a variety of sources. ☆23 Updated this week
- The (new) discovery backend for https://odcrawler.xyz ☆32 Updated 2 years ago
- Userscript to strip click tracking junk from Google search results URLs ☆15 Updated 5 years ago
- Adblock/AdGuard filters for various self-empowerment ☆20 Updated 9 months ago