mediacloud/date_guesser

A library to extract a publication date from a web page, along with a measure of the accuracy.

☆41

Alternatives and similar repositories for date_guesser

Users who are interested in date_guesser are comparing it to the libraries listed below.

mediacloud / feed_seeker
Find RSS, Atom, XML, and RDF feeds on webpages
☆31 · Nov 6, 2025 · Updated 8 months ago
mediacloud / web-tools
The shared repository for Media Cloud web apps (Explorer, Source Manager, Topic Mapper)
☆65 · Dec 14, 2023 · Updated 2 years ago
mediacloud / nyt-news-labeler
Tag news stories based on models trained on the NYT corpus.
☆41 · Mar 1, 2023 · Updated 3 years ago
kmunger / Topic_Models
Presentation for the NYU Data Lab December 2015
☆14 · Dec 2, 2015 · Updated 10 years ago
Webhose / article-date-extractor
Automatically extracts and normalizes an online article or blog post publication date
☆120 · Aug 10, 2023 · Updated 2 years ago
mkearney / tfse
🛠 Useful R functions for various things
☆18 · Jul 4, 2019 · Updated 7 years ago
lukasgebhard / Political-News-Filter
A classifier that distinguishes political from non-political news articles.
☆31 · Jul 30, 2023 · Updated 2 years ago
Koshqua / scrapio
Simple and easy-to-use scraper and crawler in Go.
☆12 · May 4, 2020 · Updated 6 years ago
fhamborg / NewsBirdServer
Matrix-based News Aggregation to Explore Media Bias
☆20 · Jun 26, 2018 · Updated 8 years ago
TeamHG-Memex / extract-html-diff
Extract the difference between two HTML pages
☆33 · Apr 8, 2026 · Updated 3 months ago
jandix / mediacloudr
API Wrapper for the mediacloud.org API
☆16 · Aug 20, 2019 · Updated 6 years ago
inferlink / landmark-extractor
☆11 · May 31, 2019 · Updated 7 years ago
mediacloud / backend
Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online me…
☆289 · Nov 20, 2023 · Updated 2 years ago
ercexpo / reddit_incivility
Classification of incivility in Reddit posts
☆19 · Nov 19, 2020 · Updated 5 years ago
dr-JT / semoutput
Create nice-looking output for CFA and SEM analyses using the lavaan and semPlot packages
☆22 · Jun 3, 2026 · Updated last month
GateNLP / ultimate-sitemap-parser
Ultimate Website Sitemap Parser
☆255 · Jun 16, 2026 · Updated last month
zhannar / Media-Bias-NLP-Clustering
Revealing the Omitted - An Exploration of Media Bias in the news coverage of Obamacare. Employs Selenium and BeautifulSoup to scrape over…
☆17 · Feb 9, 2019 · Updated 7 years ago
triptych / godot_reader_tutorial
☆24 · Jun 20, 2022 · Updated 4 years ago
gwu-libraries / sfm-ui
Social Feed Manager user interface application.
☆157 · Jun 25, 2024 · Updated 2 years ago
richarddmorey / encrypt_data_example
Shows how to encrypt data held in public space
☆11 · Aug 11, 2017 · Updated 8 years ago
fhamborg / NewsMTSC
Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k se…
☆156 · Jul 18, 2025 · Updated last year
peterwaksman / Narwhal
Narwhal is a keyword and KEY NARRATIVE manager that creates language-aware classes. Because Narwhal does not use NLP it avoids complexity…
☆12 · Oct 16, 2018 · Updated 7 years ago
zahlabut / LogTool
OpenStack logs - export errors and other useful modes
☆14 · Jun 3, 2026 · Updated last month
scrapinghub / shublang
Pluggable DSL that uses pipes to perform a series of linear transformations to extract data
☆16 · Jul 9, 2024 · Updated 2 years ago
xaiguy / chippy
☆13 · Feb 26, 2023 · Updated 3 years ago
flatsiedatsie / webthings-network-presence-detection
Devices on the local network can be added as a 'thing' in the Candle Controller / WebThings Gateway. Automations can then respond to thei…
☆12 · Feb 9, 2026 · Updated 5 months ago
binder-examples / r_with_python
Minimal working example for a binder with both R and Python Jupyter and RMarkdown notebooks
☆31 · Mar 26, 2019 · Updated 7 years ago
goutomroy / digging_asyncio
Python 3.7 asyncio tutorial.
☆14 · Aug 24, 2019 · Updated 6 years ago
crawlbase / proxycrawl-python
ProxyCrawl Python library for scraping and crawling
☆58 · Jul 4, 2023 · Updated 3 years ago
jplusplus / skolstatistik
A collection of datasets from Skolverket
☆11 · Sep 1, 2020 · Updated 5 years ago
gunesmes / python-selenium-behave-page-object-docker
Run your Selenium BDD (Behaviour Driven Development) test cases in Docker. Python, Selenium, Behave, Chrome, Docker. Page object mode (PO…
☆12 · Nov 5, 2024 · Updated last year
weedle1912 / detection-and-tracking
Autonomous object tracking: A combination of a detector and a tracker
☆14 · May 31, 2018 · Updated 8 years ago
AILab-FOI / B.A.R.I.C.A.
B.A.R.I.C.A. is an acronym standing for "Beautiful ARtificial Intelligence Cognitive Agent"
☆11 · Apr 13, 2026 · Updated 3 months ago
litrl / litrl_code
litrl browser and detectors
☆10 · Oct 5, 2023 · Updated 2 years ago
sunholo-data / sunholo-py
A Python library to enable GenAI and LLMOps within Google Cloud Platform
☆17 · Mar 12, 2026 · Updated 4 months ago
Tyrone-Zhao / crawlerUtils
Utilities for programming web crawlers
☆11 · May 16, 2019 · Updated 7 years ago
uptick / react-object-list
Neat table/list views with filtering and pagination support; powered by React.
☆13 · Jan 25, 2023 · Updated 3 years ago
Anakeyn / Bert_Squad_SEO
This tool provides a "Bert Score" for up to the first 30 pages responding to a question in Google
☆13 · Feb 10, 2020 · Updated 6 years ago
neurobin / phantomjspy
Python wrapper for PhantomJS
☆15 · May 28, 2021 · Updated 5 years ago