shavit / crawlero
Distributed web crawlers. Fault tolerance, user-agent randomizer, RabbitMQ, Tor, PostgreSQL.
☆16Updated 7 years ago
Alternatives and similar repositories for crawlero:
Users who are interested in crawlero are comparing it to the repositories listed below
- A distributed system for mining common crawl using SQS, AWS-EC2 and S3☆18Updated 10 years ago
- Site Hound (previously THH) is a Domain Discovery Tool☆23Updated 3 years ago
- A set of services for monitoring of multiple social media platforms based on Docker.☆16Updated 3 years ago
- Blesta Namesilo module☆12Updated 4 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆55Updated last year
- Watch live SMTP traffic in a web interface.☆15Updated last week
- Crawl websites for videos from Youtube, Vimeo, Soundcloud, etc☆31Updated 3 years ago
- Walmart Web Scraper written in Python 3 to extract coupon details for a store location☆14Updated 6 years ago
- Track the keyword positions☆18Updated 11 years ago
- Example of how to pre-process news articles with textbox and index them in Elasticsearch☆13Updated 7 years ago
- Complete docker installation of Booktype 2.3.☆13Updated 3 years ago
- A rotating socks proxy using Tor, Delegate and Haproxy☆26Updated 10 years ago
- Finding good domains is difficult. That is why we have built this software : Expired Domains Finder. This software is free with no warran…☆9Updated 7 years ago
- A Selenium-based automated program that scrapes profile data, stores it in CSV, follows the profiles, and saves each profile as a PDF.☆32Updated last year
- Current CIDR-formatted list of unwanted bots caught on my systems.☆16Updated 3 years ago
- Google SEO scraper for "allintitle:keyword" queries.☆23Updated 10 years ago
- Catch dropped domains with the public APIs from GoDaddy, Name, NameSilo, Dynadot, etc.☆50Updated 8 years ago
- Automates the process of repeatedly searching for a website via scraped proxy IP and search keywords☆44Updated last year
- Uses the Twitter API to parse crypto tags as part of the research.☆17Updated 4 years ago
- ☆14Updated 2 years ago
- Integrate Watson Studio and Watson Campaign Automation to tailor your target audience for effective campaigns☆12Updated 3 years ago
- A tool that is built using several open source services and uses OpenAI's GPT-2 as a base model.☆4Updated 2 years ago
- Sending SMS via User Profiles☆9Updated last week
- A docker networking driver that transparently tunnels docker containers TCP traffic through a proxy☆29Updated 6 years ago
- ☆15Updated 2 years ago
- An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).☆19Updated 3 years ago
- Quora Question Scraper - Find & Export relevant Questions 10x faster☆16Updated 5 years ago
- My own collection of bash scripts☆17Updated last year
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆44Updated 7 years ago
- Decentralized web archiving☆19Updated 6 years ago