mikwielgus / forum-dl
Scrape posts and threads from forums, news aggregators, and mail archives; export to JSONL, mailbox, or WARC
☆95 Updated last year
Alternatives and similar repositories for forum-dl
Users that are interested in forum-dl are comparing it to the libraries listed below
- Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. ☆125 Updated 7 months ago
- Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing… ☆85 Updated 2 weeks ago
- Reddit archiver ☆175 Updated last year
- A self-hosted bookmark database with full-text page content search ☆94 Updated 2 months ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆64 Updated 4 months ago
- backup and parse your browser history databases (chrome, firefox, safari, and other chrome/firefox derivatives) ☆142 Updated 8 months ago
- Bash scripts which interact with Internet Archive Wayback Machine's Save Page Now ☆131 Updated 4 months ago
- Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox. ☆339 Updated 3 months ago
- Docker Container for grab-site ☆12 Updated 11 months ago
- Clean a series of links, resolving redirects and finding Wayback results if page is gone. Originally written to aid with importing from A… ☆18 Updated 10 months ago
- Rexit - Liberate your Reddit Chats. This tool will export your reddit chats into a plethora of formats ☆28 Updated this week
- Home of the official docker image for ArchiveBox ☆52 Updated 7 months ago
- ☆39 Updated 2 years ago
- Export userdata from your reddit accounts. Submissions, comments, saved, upvoted contents are supported. ☆22 Updated 9 months ago
- Creates a complete full text historical archive for an RSS or ATOM feed. ☆123 Updated last week
- A library/CLI tool to parse data out of your Google Takeout (History, Activity, Youtube, Locations, etc...) ☆102 Updated last week
- Self-hostable link database ☆105 Updated this week
- Interact with ArchiveBox to automatically archive all your saved reddit posts and comments. ☆17 Updated 2 years ago
- 🍨 High-fidelity, browser-based, single-page web archiving library and CLI for witnessing the web. ☆166 Updated last month
- Javascript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a oneshot CLI command to extract each page… ☆40 Updated 10 months ago
- Reddit takeout: export your account data as JSON: comments, submissions, upvotes etc. 🦖 ☆173 Updated 2 weeks ago
- Tool to index and serve HTML files. Powered by Datasette. ☆104 Updated 3 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆44 Updated last week
- A modified version of searx (the privacy-respecting metasearch engine) to only search an allowlist of sites, to build functionality simil… ☆19 Updated 3 years ago
- 🦛 scrapes websites and generates rss feeds ☆53 Updated 5 months ago
- Tube Archivist Companion for your Browser ☆203 Updated 4 months ago
- [mirror] Backup a list of github starred repositories for the specified user. ☆139 Updated 2 years ago
- Reverse Incremental Rclone Backups ☆15 Updated 2 years ago
- ⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 … ☆78 Updated 7 months ago
- The Toolkit API, app, and browser extension. Start preserving now. ☆47 Updated last month