alexksikes / mass-scraping
Quickly download and scrape websites on a massive scale.
☆63 · Updated 12 years ago
Related projects
Alternatives and complementary repositories for mass-scraping
- Scrapes sites. Gets news. Eventually events. ☆82 · Updated 8 years ago
- Scraping the Walmart, Target, and Home Depot websites and getting data from the Amazon API ☆15 · Updated 8 years ago
- ScraperWiki Python library for scraping and saving data ☆159 · Updated last year
- Scrape Google search results with Scrapy. ☆98 · Updated 4 years ago
- Send a text when a new Craigslist posting matches a given keyword or phrase ☆96 · Updated 9 years ago
- Collection of Python scripts I have created to crawl various websites, mostly for lead generation projects to match keywords and collect … ☆129 · Updated last year
- Python library with common functionality for writing web scrapers ☆102 · Updated 9 years ago
- A library to interface with the Linkscape API. ☆41 · Updated 6 years ago
- SEO tool to track the ranking of keywords on search engines (Google App Engine application) ☆47 · Updated 12 years ago
- Streaming web crawler with WebSocket API ☆44 · Updated last year
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 · Updated 9 months ago
- Source real estate prices from the Common Crawl. ☆27 · Updated 6 years ago
- [UNMAINTAINED] Firefox addon for Scrapely ☆5 · Updated 8 years ago
- Python code to scrape and collect data from the RSS feeds Facebook uses to augment its Trending Section ☆57 · Updated 6 years ago
- ☆36 · Updated last year
- Social media monitoring tools such as sentiment analysis, keyword tracking and more ☆46 · Updated 10 years ago
- A modular template for scraping data from the web to send yourself scheduled email reports ☆40 · Updated 4 years ago
- Legacy backend for Open States ☆87 · Updated 4 years ago
- Using Scrapy to get company profiles from http://crunchbase.com ☆31 · Updated 11 years ago
- Python wrapper around the SEMrush API. ☆30 · Updated last year
- Python scripts for scraping bus ticket data from the websites of BoltBus, Greyhound, Megabus, GoBus, Amtrak, Peterpan, and EasternTravel. ☆39 · Updated 4 years ago
- Crawler and scraper of the public directory of companies on LinkedIn. ☆25 · Updated 5 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆44 · Updated 6 years ago
- Site Hound (previously THH) is a Domain Discovery Tool ☆23 · Updated 3 years ago
- A Python desktop application that makes the use of freelancing subreddits easier and faster. ☆35 · Updated 4 years ago
- Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations ☆40 · Updated 6 months ago
- Take streaming tweets, extract hashtags & usernames, create graph, export graphml for Gephi visualisation ☆33 · Updated 11 years ago
- An automatic proxy rotator - multithreaded & SSL ☆78 · Updated 3 years ago