openzim / python-scraperlib
Collection of Python code to re-use across Python-based scrapers
★21 · Updated last month
Alternatives and similar repositories for python-scraperlib:
Users who are interested in python-scraperlib are comparing it to the libraries listed below.
- Create a ZIM file from a YouTube channel/username/playlist ★66 · Updated last week
- An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. ★15 · Updated 4 years ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ★55 · Updated 2 weeks ago
- Turns a collection of documents into a browsable ZIM file ★24 · Updated last month
- An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces. ★53 · Updated 7 months ago
- This is the HeadQuarters of my digital info. HPI library got me inspired and I'm trying to play with the idea on a smaller scale for myse… ★21 · Updated last year
- Various ZIM command line tools ★154 · Updated last month
- Clean a series of links, resolving redirects and finding Wayback results if page is gone. Originally written to aid with importing from A… ★18 · Updated 6 months ago
- Awesome links related to RSS, ATOM, and Syndication formats. ★56 · Updated 8 months ago
- Bookmarked archived links ★18 · Updated this week
- DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by Arch… ★19 · Updated last year
- Tools to count the number of public domain and free to distribute movies registered in IMDB ★24 · Updated 5 years ago
- Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing… ★73 · Updated this week
- Official Python package for ArchiveBox, the self-hosted internet archiving solution. ★13 · Updated 6 months ago
- A web scraping suite to efficiently load .epub files onto your Kindle. ★23 · Updated 2 weeks ago
- iFixit to ZIM scraper ★29 · Updated last week
- A list of things related to software, literature, and other content for Memento ★97 · Updated 10 months ago
- ActivityPub server without JavaScript, designed for simplicity and accessibility. Includes calendar, news and sharing economy features to… ★71 · Updated this week
- A WebFinger server for Facebook and Twitter. ★20 · Updated 3 years ago
- ArchiveBoxMatic: configure ArchiveBox with the simplicity of a YAML file. ★14 · Updated 4 years ago
- Farm operated by bots to grow and harvest new ZIM files ★101 · Updated last week
- Homebrew formula for the ArchiveBox self-hosted internet archiving solution. ★28 · Updated 6 months ago
- Curated list of awesome Mastodon-related stuff! ★37 · Updated last month
- Overview of telecommunication standards and technologies for internet access ★14 · Updated 10 months ago
- Run your own X, in a few clicks. ★12 · Updated 5 months ago
- A powerful tool that converts voice recordings into high-quality Anki flashcards using AI-powered transcription and LLM processing, featu… ★18 · Updated 3 months ago
- Free and open source web browser ★33 · Updated 8 months ago
- Kiwix & openZIM build engine ★98 · Updated last week
- Your "yellow pages" of Enterprise Free Software Publishers, their products and success cases ★17 · Updated 9 months ago
- Functional composable pipelines allowing clean separation of the business logic and its implementation ★11 · Updated 10 months ago