mikwielgus / forum-dl
Scrape posts and threads from forums, news aggregators, and mail archives; export to JSONL, mailbox, or WARC
☆100 Updated last year
Alternatives and similar repositories for forum-dl
Users who are interested in forum-dl are comparing it to the libraries listed below
- Clean a series of links, resolving redirects and finding Wayback results if page is gone. Originally written to aid with importing from A… ☆18 Updated 11 months ago
- Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. ☆129 Updated 3 weeks ago
- A self-hosted bookmark database with full-text page content search ☆96 Updated 3 months ago
- Reddit archiver ☆178 Updated last year
- Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing… ☆90 Updated last month
- Export userdata from your reddit accounts. Submissions, comments, saved, upvoted contents are supported. ☆22 Updated 10 months ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆70 Updated 6 months ago
- Tool to index and serve HTML files. Powered by Datasette. ☆107 Updated 3 years ago
- ⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 … ☆83 Updated last month
- Docker Container for grab-site ☆12 Updated last year
- Home of the official docker image for ArchiveBox ☆53 Updated 9 months ago
- Javascript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a oneshot CLI command to extract each page… ☆40 Updated last year
- Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox. ☆355 Updated 4 months ago
- backup and parse your browser history databases (chrome, firefox, safari, and other chrome/firefox derivatives) ☆143 Updated last week
- Interact with ArchiveBox to automatically archive all your saved reddit posts and comments. ☆17 Updated 2 years ago
- The Temboz RSS/Atom feed reader ☆81 Updated last year
- [mirror] Backup a list of github starred repositories for the specified user. ☆141 Updated 2 years ago
- Roffline allows you to browse Reddit offline ☆81 Updated last year
- Self-hostable link database ☆120 Updated this week
- [moved to codeberg] Archive all your favorite podcasts ☆150 Updated this week
- A library/CLI tool to parse data out of your Google Takeout (History, Activity, Youtube, Locations, etc...) ☆110 Updated last week
- Scripts to build and boot warrior virtual machine containing Docker ☆121 Updated 5 months ago
- Our minimalist Certbot alternative that uses the Porkbun API to download and install web server SSL certificates ☆43 Updated 2 years ago
- A server to collect & archive websites that also supports video downloads ☆86 Updated 2 years ago
- Simple podcast downloader ☆37 Updated 3 months ago
- Reddit takeout: export your account data as JSON: comments, submissions, upvotes etc. 🦖 ☆174 Updated 2 months ago
- WIP - scripts for analyzing the (in)security of Chrome extensions ☆27 Updated last year
- Collection of Python code to re-use across Python-based scrapers ☆25 Updated 4 months ago
- Bash scripts which interact with Internet Archive Wayback Machine's Save Page Now ☆132 Updated 5 months ago
- Rexit - Liberate your Reddit Chats. This tool will export your reddit chats into a plethora of formats ☆30 Updated 3 weeks ago