mediacloud / feed_seeker
Find RSS, Atom, XML, and RDF feeds on webpages
☆30 · Updated 6 months ago
Alternatives and similar repositories for feed_seeker
Users who are interested in feed_seeker are comparing it to the libraries listed below:
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 5 years ago
- Tag news stories based on models trained on the NYT corpus. ☆42 · Updated 2 years ago
- A financial disclosure data extraction tool. ☆15 · Updated last year
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 · Updated 4 years ago
- Inspect a URL and estimate if it contains a news story ☆39 · Updated 4 months ago
- ☆11 · Updated 5 years ago
- America's most comprehensive dictionary of campaign finance jargon. A free resource created by and for data journalists. ☆17 · Updated 2 weeks ago
- A maximum-strength name parser for record linkage. ☆36 · Updated 2 weeks ago
- Some tools to help analyze the Twitter archive ☆62 · Updated 7 months ago
- A Python library for defining rule-based overrides on messy data ☆13 · Updated 4 months ago
- A collection of projects I did while at General Assembly Singapore - as part of Data Science Immersive ☆11 · Updated 4 years ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 · Updated last year
- Python based Wikidata framework for easy dataframe extraction ☆43 · Updated last year
- A browser extension providing Open Access bibliographical services ☆17 · Updated 2 years ago
- Materials to reproduce findings in our story, "Google’s Top Search Result? Surprise! It’s Google" ☆34 · Updated 4 years ago
- Scrape various open data directories to create an index of what's available out there ☆36 · Updated 2 months ago
- Named-Entity Recognition extension for OpenRefine ☆27 · Updated 2 years ago
- Getting, analysing and displaying lists of papers ☆15 · Updated 6 months ago
- Presentations on Quantified Self and Self-Tracking with Python ☆30 · Updated 2 years ago
- Tools for tracking stories on news homepages ☆48 · Updated 5 years ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 · Updated 6 years ago
- Pull out versions of specific files from a gitscraping repo into individual files. ☆15 · Updated 3 years ago
- Virtual patent marking crawler at iproduct.epfl.ch ☆14 · Updated 7 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 · Updated 2 years ago
- Python package for performing deduplication using flexible text matching and cleaning in pandas dataframes ☆25 · Updated 4 years ago
- Record Linkage ToolKit (Find and link entities) ☆110 · Updated last year
- Automatically exported from code.google.com/p/guess-language ☆53 · Updated last year
- NLRB data scraper by LexPredict ☆12 · Updated 2 years ago
- Open Access PDF harvester ☆39 · Updated 11 months ago
- Extract networks of entities from journalistic reporting ☆48 · Updated last year