mediacloud / feed_seeker
Find RSS, Atom, XML, and RDF feeds on webpages
☆30 Updated 5 months ago
Alternatives and similar repositories for feed_seeker:
Users who are interested in feed_seeker are comparing it to the libraries listed below.
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- A financial disclosure data extraction tool. ☆13 Updated last year
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 Updated 4 years ago
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data. ☆13 Updated last week
- Tag news stories based on models trained on the NYT corpus. ☆42 Updated 2 years ago
- Inspect a URL and estimate if it contains a news story ☆39 Updated 3 months ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 Updated 6 years ago
- A Python library for defining rule-based overrides on messy data ☆13 Updated 3 months ago
- Presentations on Quantified Self and Self-Tracking with Python ☆29 Updated 2 years ago
- Some tools to help analyze the twitter archive ☆62 Updated 6 months ago
- A browser extension providing Open Access bibliographical services ☆17 Updated 2 years ago
- Scraper for Facebook, Gab, Google and TikTok ☆22 Updated 8 months ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 Updated 4 years ago
- A maximum-strength name parser for record linkage. ☆36 Updated last month
- A list of over 5000 US news domains and their social media accounts ☆45 Updated 2 years ago
- Advanced news feeds extractor and finder library. Helps to automatically extract news from websites without RSS/ATOM feeds ☆79 Updated 2 years ago
- America's most comprehensive dictionary of campaign finance jargon. A free resource created by and for data journalists. ☆17 Updated last week
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated last year
- ☆12 Updated 5 years ago
- Virtual patent marking crawler at iproduct.epfl.ch ☆14 Updated 7 years ago
- Crawl sites for RSS, Atom, and JSON feeds. ☆71 Updated 9 months ago
- Examples for getting started using https://case.law ☆65 Updated 2 years ago
- Visualisation of browsing history patterns using pandas and seaborn ☆10 Updated 4 years ago
- Extract networks of entities from journalistic reporting ☆48 Updated last year
- A helper library full of URL-related heuristics. ☆66 Updated 5 months ago
- Python wrapper for a C++ Double Metaphone ☆15 Updated 2 years ago
- Scrape various open data directories to create an index of what's available out there ☆36 Updated last month
- This project is a wrapper for Leilex, a legal entity identifier API. Includes ISIN-LEI conversion. Search for an LEI number using a company name. ☆24 Updated 5 months ago
- A Google Trends Analytics Package ☆13 Updated 9 months ago
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated 4 months ago