chrisstiles / PublishDateBot
A Reddit bot that finds original publish dates on linked articles.
☆10 · Updated last year
Alternatives and similar repositories for PublishDateBot
Users who are interested in PublishDateBot are comparing it to the libraries listed below.
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service ☆189 · Updated 3 weeks ago
- Tag news stories based on models trained on the NYT corpus. ☆42 · Updated 2 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 6 years ago
- A browser extension to share data about your social feed with researchers and journalists to increase transparency. ☆86 · Updated 2 years ago
- Grabbing all news. ☆61 · Updated 6 years ago
- Estimating the age of web resources ☆97 · Updated 8 months ago
- Parse government documents into well formed JSON ☆75 · Updated this week
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data. ☆14 · Updated 11 months ago
- Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more ☆224 · Updated 2 years ago
- The subreddit archiver ☆178 · Updated 2 years ago
- track changes to the news, where news is anything with an RSS feed ☆182 · Updated 5 years ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆18 · Updated 2 years ago
- Media Bias Fact Check extension ☆43 · Updated this week
- Google News RSS as OPML ☆25 · Updated 7 years ago
- A list of over 5000 US news domains and their social media accounts ☆49 · Updated 2 years ago
- A helper library full of URL-related heuristics. ☆73 · Updated this week
- Now included in rigour ☆152 · Updated 2 months ago
- Github mirror - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing) ☆37 · Updated last year
- Wayback Machine API interface & a command-line tool ☆561 · Updated last year
- A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine. ☆122 · Updated last year
- ☆76 · Updated this week
- Streaming WARC/ARC library for fast web archive IO ☆446 · Updated last year
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆198 · Updated 2 weeks ago
- An unofficial Python API that allows users to create a corpus of lyrical text from their favorite artists and billboard charts ☆18 · Updated 7 years ago
- Ultimate Website Sitemap Parser ☆242 · Updated 2 weeks ago
- A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches. ☆220 · Updated 2 years ago
- Code and data belonging to our CSCW 2019 paper: "Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites". ☆136 · Updated 6 years ago
- Python Pushshift.io API Wrapper (for comment/submission search) ☆363 · Updated 2 years ago
- Python Pushshift.io API Wrapper (for comment/submission search) ☆14 · Updated 4 years ago
- ☆44 · Updated 4 years ago