Grabbing all news.
☆61 · Dec 23, 2019 · Updated 6 years ago
Alternatives and similar repositories for NewsGrabber
Users who are interested in NewsGrabber are comparing it to the repositories listed below.
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes. ☆11 · Dec 3, 2017 · Updated 8 years ago
- ArchiveBot, an IRC bot for archiving websites ☆408 · Aug 6, 2025 · Updated 7 months ago
- Wget-compatible web downloader and crawler. ☆603 · Apr 29, 2024 · Updated last year
- An HTTP-based warc-to-zip converter ☆12 · Mar 8, 2013 · Updated 13 years ago
- URLTeam's second generation of URL shortener archiving tools ☆81 · Mar 12, 2026 · Updated last week
- Archiving GitHub ☆11 · Aug 5, 2025 · Updated 7 months ago
- Automating description for Web Archives in ArchivesSpace using the Archive-It CDX and Partner Data APIs ☆11 · Aug 10, 2018 · Updated 7 years ago
- Tools for tracking stories on news homepages ☆48 · Oct 22, 2019 · Updated 6 years ago
- Nondestructive warc-in-tar to warc conversion ☆27 · Apr 21, 2013 · Updated 12 years ago
- A tool for working with tweet archives. ☆15 · Jan 1, 2023 · Updated 3 years ago
- some scripts developed to work with ArchivesSpace API ☆11 · May 5, 2020 · Updated 5 years ago
- Combinatorially flip bits by brute force until a file is no longer corrupted. ☆11 · Sep 28, 2015 · Updated 10 years ago
- A Rails engine supporting the discovery of web archives. ☆50 · Jun 13, 2023 · Updated 2 years ago
- Official Python SDK for interacting with the Knock API ☆17 · Mar 13, 2026 · Updated last week
- We back up a lot of stuff from around the web; now it's time to back up the Internet Archive, just in case. ☆92 · Jul 13, 2020 · Updated 5 years ago
- creates a wav file from multiple bin (redump.org format) ☆12 · Apr 19, 2018 · Updated 7 years ago
- ☆12 · Jan 18, 2016 · Updated 10 years ago
- A Bill of Rights and Principles for Learning in the Digital Age ☆42 · Jan 24, 2013 · Updated 13 years ago
- The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns ☆1,559 · May 23, 2025 · Updated 9 months ago
- Digital Preservation of HTTP in documentary heritage. ☆24 · May 26, 2023 · Updated 2 years ago
- Save My News: A personal, permanent clipping service ☆29 · Oct 7, 2023 · Updated 2 years ago
- Python script to create CDX index files of WARC data ☆16 · Sep 7, 2018 · Updated 7 years ago
- BitTorrent Data Set ☆13 · Jan 2, 2025 · Updated last year
- Flutrack platform gathers flu related tweets from the entire world, with searching tag, words that are influenza synonyms and flu symptom… ☆13 · Apr 22, 2019 · Updated 6 years ago
- Mirrored from https://gitlab.com/compiz/compicc.git ☆10 · Nov 5, 2024 · Updated last year
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆30 · Jan 17, 2026 · Updated 2 months ago
- An interpreter for the Rapira (Рапира) programming language ☆19 · Dec 1, 2020 · Updated 5 years ago
- Sources and/or disassembly listings to BIOS and firmware ☆23 · Nov 1, 2025 · Updated 4 months ago
- Tools to browse disk images and file system metadata in a web service ☆24 · Jan 10, 2024 · Updated 2 years ago
- Parse OCR result files for pagenos, tables of contents, etc. ☆14 · Nov 30, 2011 · Updated 14 years ago
- Wiki backup and issue tracking for indieweb.org ☆27 · Updated this week
- Ekeko is a tool that helps you save all of your favorited memes, videos and other online resources. ☆15 · Oct 27, 2022 · Updated 3 years ago
- Trough: Big data, small databases. ☆42 · Jul 25, 2024 · Updated last year
- ☆16 · Oct 26, 2022 · Updated 3 years ago
- A demonstration transnational register of beneficial ownership data from the UK, Denmark, Slovakia and Armenia ☆19 · Oct 30, 2024 · Updated last year
- scrapy-extras -- a collection of code samples and modules for the Scrapy framework. ☆14 · Dec 14, 2020 · Updated 5 years ago
- ☆11 · Apr 7, 2021 · Updated 4 years ago
- A Tool To Push Web Resources Into Web Archives ☆432 · Jan 23, 2024 · Updated 2 years ago
- Gentoo Linux Install Scripts ☆10 · Feb 9, 2015 · Updated 11 years ago