mikwielgus / forum-dl
Scrape posts, threads from forums, news aggregators, mail archives, export to JSONL, mailbox, WARC
☆78 Updated 6 months ago
Alternatives and similar repositories for forum-dl:
Users who are interested in forum-dl are comparing it to the libraries listed below
- A self-hosted bookmark database with full-text page content search ☆84 Updated last year
- Selenium Open Source Search Engine & crawler ☆54 Updated this week
- Creates a complete full text historical archive for an RSS or ATOM feed. ☆112 Updated this week
- Exports all accessible reddit comments for an account using pushshift ☆11 Updated 2 months ago
- ☆11 Updated 2 years ago
- Home of the official docker image for ArchiveBox ☆50 Updated last month
- The Temboz RSS/Atom feed reader ☆82 Updated last year
- DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by Arch… ☆18 Updated 11 months ago
- Command line tool written in Go for sorting and categorizing personal files like screenshots, recordings, logs and more. ☆19 Updated 2 years ago
- searchmysite.net is an open source search engine and search as a service ☆79 Updated 2 months ago
- Simple podcast downloader ☆36 Updated 7 months ago
- Warrior virtual machine appliance (version 4) ☆20 Updated 4 months ago
- Downloads content from reddit ☆19 Updated last year
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆48 Updated last week
- WIP - scripts for analyzing the (in)security of Chrome extensions ☆26 Updated 9 months ago
- A gui to control and manage snapcast written in python ☆11 Updated 2 months ago
- ⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 … ☆55 Updated 3 weeks ago
- Javascript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a oneshot CLI command to extract each page… ☆38 Updated 4 months ago
- 🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. ☆14 Updated 4 years ago
- A privacy focused, self-hosted podcatcher. ☆36 Updated last month
- A set of utilities to help bring content and users from legacy social media networks into the fediverse ☆26 Updated 3 months ago
- Reddit archiver ☆160 Updated 11 months ago
- Convert an online sitemap to Atom, RSS and JSON feeds ☆61 Updated last year
- Reddit takeout: export your account data as JSON: comments, submissions, upvotes etc. 🦖 ☆166 Updated 2 months ago
- Mount SMB from Tailscale Hosts ☆10 Updated last year
- Get news from foreign RSS feeds translated, summarized, and spoken to you on demand. ☆55 Updated 3 weeks ago
- 🦛 scrapes websites and generates rss feeds ☆54 Updated last month
- Fetch all your bookmarked tweets and make them accessible through a web interface. ☆29 Updated last year
- Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing… ☆55 Updated this week