dansandland / cassandra-scrapy-pipeline
☆15Updated 9 years ago
Alternatives and similar repositories for cassandra-scrapy-pipeline
Users who are interested in cassandra-scrapy-pipeline are comparing it to the libraries listed below.
- A platform for real-time streaming search☆102Updated 9 years ago
- A scalable and efficient crawler with a Docker cluster; crawls a million pages in 2 hours on a single machine☆97Updated last year
- Code reference from my Qbox blog posts.☆87Updated 10 years ago
- Automatic Item List Extraction☆87Updated 9 years ago
- Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even whe…☆55Updated last year
- Text classification using Naive Bayes and Elasticsearch☆154Updated 9 years ago
- High Level Kafka Scanner☆19Updated 7 years ago
- Scrapy extension which writes crawled items to Kafka☆30Updated 6 years ago
- Natural Language Processing with Spark's MLlib☆62Updated 7 years ago
- A Python library to detect and extract listing data from an HTML page.☆108Updated 8 years ago
- An extension of the kafka-python package that adds features like multiprocess consumers.☆39Updated last year
- Find which links on a web page are pagination links☆29Updated 8 years ago
- Let's perform Twitter sentiment analysis using Python, Docker, Elasticsearch, and Kibana!☆137Updated 5 years ago
- Run PredictionIO inside Docker☆200Updated 6 years ago
- A Scrapy pipeline which sends items to an Elasticsearch server☆98Updated 7 years ago
- PredictionIO Python SDK☆196Updated 7 years ago
- Film recommendations with Apache Spark and Python☆61Updated 10 years ago
- Fast, easy and intuitive machine learning prototyping.☆124Updated 11 years ago
- CustomerML is an open source customer science platform leveraging the power of Predictiveworks and fully integrated with Elasticsearch an…☆48Updated 10 years ago
- ☆146Updated 9 years ago
- Collection of Scrapy utilities (extensions, middlewares, pipelines, etc)☆32Updated 7 years ago
- This is a simple streaming application that utilises Kafka and Python☆46Updated 6 years ago
- Docker container for PredictionIO-based machine learning services☆73Updated last year
- My capstone project for Galvanize (Zipfian Academy)☆38Updated 6 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders.☆11Updated 10 years ago
- Deprecated. Formerly: scripts to make it easier to set up and manipulate clusters at Amazon EC2☆110Updated 13 years ago
- HopsYARN Tensorflow Framework.☆31Updated 5 years ago
- docker scrapyd scrapy boot2docker crawler - a spider Python application that can be "Dockerized".☆42Updated 10 years ago
- PredictionIO E-Commerce Recommendation Engine Template (Scala-based parallelized engine)☆74Updated 6 years ago
- Additional opennlp mapping type for elasticsearch in order to perform named entity recognition☆136Updated 9 years ago