mediacloud / feed_seeker
Find rss, atom, xml, and rdf feeds on webpages
☆30Updated last month
Related projects
Alternatives and complementary repositories for feed_seeker
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆42Updated 5 years ago
- how hard is it to get a list of all local news sites in the United States (LOL)☆8Updated 4 years ago
- Tag news stories based on models trained on the NYT corpus.☆40Updated last year
- A financial disclosure data extraction tool.☆13Updated last year
- A Python library for defining rule-based overrides on messy data☆12Updated this week
- Simple tools for summarizing .mbox email archives.☆10Updated 4 years ago
- A simple web crawler for stackshare.io using Scrapy.☆9Updated 5 years ago
- Inspect a URL and estimate if it contains a news story☆39Updated last month
- Visual analytics application for qualitative text analysis☆24Updated last year
- A maximum-strength name parser for record linkage.☆34Updated 3 months ago
- Ask questions about government data.☆37Updated 5 years ago
- Datasette plugin providing instructions for exporting data to Jupyter or Observable☆12Updated last year
- Datasette plugin for modifying table schemas☆16Updated 2 months ago
- Scrape various open data directories to create an index of what's available out there☆31Updated this week
- Virtual patent marking crawler at iproduct.epfl.ch☆14Updated 7 years ago
- ☆12Updated 5 years ago
- Tools for tracking stories on news homepages☆48Updated 5 years ago
- A browser extension providing Open Access bibliographical services☆14Updated last year
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data.☆13Updated last year
- A scraping Master-slave system based on Google App Engine☆11Updated 4 years ago
- Python-based Wikidata framework for easy dataframe extraction☆39Updated 11 months ago
- Alignment, a collaborative, system-aided, user-driven ontology/vocabulary matching and validation platform.☆12Updated 2 years ago
- Save an RSS or ATOM feed to a SQLite database☆47Updated 2 years ago
- A helper library full of URL-related heuristics.☆64Updated last month
- Utilities which support the processing of XML-based USPTO trademark bulk download files☆29Updated 4 years ago
- Sidewall is a Python library for interacting with the Dimensions search API.☆17Updated 2 months ago
- An open interface to GDELT APIs☆41Updated 11 months ago
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models☆16Updated last week
- an app that makes your personalized newsletter based on your bookmarks☆11Updated 7 years ago