DBeath / feedsearch-crawler
Crawl sites for RSS, Atom, and JSON feeds.
☆69 Updated 8 months ago
Alternatives and similar repositories for feedsearch-crawler:
Users interested in feedsearch-crawler are comparing it to the libraries listed below.
- Find rss, atom, xml, and rdf feeds on webpages ☆30 Updated 4 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- Search sites for RSS, Atom, and JSON feeds. ☆19 Updated 2 years ago
- Search for words, documents, images, videos, news and maps using the Brave search engine. Downloading files and images to a local hard dr… ☆50 Updated 9 months ago
- A Google Trends Analytics Package ☆13 Updated 8 months ago
- Extract text from HTML ☆133 Updated 4 years ago
- ScrapingAnt API client for Python. ☆36 Updated 7 months ago
- A News Article Collection Library ☆22 Updated last year
- Building a Job Dataset ☆21 Updated 2 years ago
- 💬 NLP - Library for splitting email content into a human-written body and an automatically appended signature. ☆25 Updated 6 years ago
- Spider templates for automatic crawlers. ☆27 Updated 2 weeks ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated last year
- Zyte Automatic Extraction integration for Scrapy ☆56 Updated 3 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆55 Updated last year
- Crawler of the "Apple Store" podcasts directory ☆9 Updated 7 years ago
- LLM plugin for embeddings using sentence-transformers ☆48 Updated last week
- Scrape various open data directories to create an index of what's available out there ☆36 Updated last week
- A maximum-strength name parser for record linkage. ☆36 Updated last week
- Wikidata's QRank as a SQLite DB. ☆28 Updated last year
- Releases for the reddit-graph project ☆18 Updated 7 months ago
- A repository demonstrating the use of real-estate-scrape to store the estimated value of a property on Redfin and Zillow every night usin… ☆30 Updated this week
- A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine. ☆112 Updated last year
- Python package for converting xml and epubs to text files ☆34 Updated 4 years ago
- Python Script for Copywriters to Gather Data from Competing Content and Find Keyword Overlap ☆12 Updated 2 years ago
- Web Page Inspection Tool UI. Google SERP Preview, Sentiment Analysis, Keyword Extraction, Named Entity Recognition & Spell Check ☆24 Updated 2 years ago
- Library that helps use puppeteer in scrapy. ☆52 Updated 3 weeks ago
- ☆13 Updated 5 years ago
- 📖 Using deep learning and scraping to analyze/summarize articles! Just drop in any URL! ☆19 Updated 2 years ago
- Find similar photos with Python 🐍 ☆12 Updated 2 years ago