DBeath / feedsearch-crawler
Crawl sites for RSS, Atom, and JSON feeds.
☆74 Updated 10 months ago
Alternatives and similar repositories for feedsearch-crawler:
Users who are interested in feedsearch-crawler are comparing it to the libraries listed below.
- Find rss, atom, xml, and rdf feeds on webpages ☆30 Updated 6 months ago
- Extract text from HTML ☆135 Updated 4 years ago
- Add website scraping abilities to Datasette ☆62 Updated 2 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- A News Article Collection Library ☆22 Updated 2 years ago
- A natural language date parser. (Python version of chrono.js) ☆25 Updated 10 months ago
- News API - fetch news from CommonCrawl, parse with NewsPlease, enrich with pre-trained machine-learning models, to structured searchable … ☆28 Updated 2 years ago
- Wikidata's QRank as a SQLite DB. ☆28 Updated last year
- Python code to scrape and collect data from the RSS feeds Facebook uses to augment its Trending Section ☆57 Updated 6 years ago
- A helper library full of URL-related heuristics. ☆69 Updated 2 weeks ago
- LLM plugin for embeddings using sentence-transformers ☆55 Updated 2 weeks ago
- Inspect a URL and estimate if it contains a news story ☆39 Updated 4 months ago
- Automatically extracts and normalizes an online article or blog post publication date ☆117 Updated last year
- ☆27 Updated 6 months ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 Updated 6 years ago
- ☆14 Updated 8 months ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆124 Updated 3 months ago
- python api wrapper for https://mercury.postlight.com/web-parser/ ☆23 Updated last year
- Python port of Boilerpipe library ☆86 Updated 7 months ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated 2 years ago
- Pre-built template for using newspaper3k on aws lambda ☆17 Updated 2 years ago
- Cloud crawler functions for scrapeulous ☆45 Updated 4 years ago
- Tool to extract the text from web article URLs and get frequent words, entity recognition, automatic summary and more ☆20 Updated 6 years ago
- advertools crawler UI ☆28 Updated 2 years ago
- Checks for keyword similarity - then generates unique keywords and a spreadsheet with similar keywords and similarity ☆20 Updated last year
- Generate a list of your GitHub stars by topic - automatically! ☆76 Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆36 Updated last week
- Quora Question Scraper - Find & Export relevant Questions 10x faster ☆16 Updated 5 years ago
- Source real estate prices from the Common Crawl. ☆27 Updated 6 years ago
- Python Script for Copywriters to Gather Data from Competing Content and Find Keyword Overlap ☆12 Updated 2 years ago