ArchiveTeam / NewsGrabber
Grabbing all news.
☆62Updated 5 years ago
Alternatives and similar repositories for NewsGrabber
Users who are interested in NewsGrabber are comparing it to the repositories listed below.
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- Serving content from a WARC☆61Updated 12 years ago
- track changes to the news, where news is anything with an RSS feed☆178Updated 4 years ago
- "Old SFM" -- manage rules and streams from social data sources, starting with twitter.☆86Updated last year
- WARC and ARC indexing and discovery tools.☆124Updated 2 months ago
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)☆160Updated 4 years ago
- A commandline tool and Python library for archiving data from Facebook using the Graph API.☆78Updated 7 years ago
- ☆36Updated last year
- Tools for tracking stories on news homepages☆48Updated 5 years ago
- 📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity☆95Updated 6 years ago
- Tool and library for handling Web ARChive (WARC) files.☆159Updated 7 months ago
- Social Feed Manager user interface application.☆155Updated 11 months ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy.☆46Updated 7 years ago
- Converts WARC files to static HTML☆44Updated 11 months ago
- NOTE: This project is no longer being actively developed. Check out https://replayweb.page / https://github.com/webrecorder/replayweb.pa…☆201Updated 4 months ago
- The Bibliotheca Anonoma's own Bing Cache and Google Cache scraper scripts. Unlike most of the other ones you've seen, these actually work…☆28Updated 7 years ago
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes.☆11Updated 7 years ago
- A collection of tools for archiving and analysing the internet.☆77Updated 2 years ago
- A queue-controlled browser automation tool for improving web crawl quality☆61Updated 2 months ago
- 🗄 Bot powering the @LinkArchiver Twitter tool to send tweeted URLs to the Wayback Machine☆46Updated 7 years ago
- Trough: Big data, small databases.☆42Updated 10 months ago
- Python tools for processing data from the Catalog of Copyright Entries☆37Updated 5 years ago
- A list of things related to software, literature, and other content for 🕣 Memento☆98Updated last year
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data.☆13Updated 3 months ago
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service☆179Updated 7 months ago
- A social media open post web archiving tool☆26Updated 3 weeks ago
- A Memento Aggregator CLI and Server in Go☆64Updated 2 months ago
- A list of tools related to W(eb)ARC(hive)☆61Updated 10 years ago
- ☆15Updated 6 years ago
- Python library for reading and writing warc files☆240Updated 3 years ago