dansandland / cassandra-scrapy-pipeline
☆15Updated 9 years ago
Alternatives and similar repositories for cassandra-scrapy-pipeline:
Users who are interested in cassandra-scrapy-pipeline are comparing it to the libraries listed below:
- PySpark for Elastic Search☆55Updated 7 years ago
- A scalable and efficient crawler with a Docker cluster; crawls a million pages in 2 hours on a single machine☆97Updated 10 months ago
- Scrapy extension which writes crawled items to Kafka☆30Updated 6 years ago
- High Level Kafka Scanner☆19Updated 7 years ago
- dllib is a distributed deep learning library running on Apache Spark☆32Updated 7 years ago
- HopsYARN Tensorflow Framework.☆32Updated 5 years ago
- Spark Application : Spark Summit 2018 : Streaming Trend Discovery☆11Updated 6 years ago
- Code reference from my Qbox blog posts.☆87Updated 9 years ago
- A platform for real-time streaming search☆103Updated 8 years ago
- A cookiecutter template for Apache Spark applications written in Scala☆10Updated 6 years ago
- CustomerML is an open source customer science platform leveraging the power of Predictiveworks and fully integrated with Elasticsearch an…☆47Updated 9 years ago
- An extension of the kafka-python package that adds features like multiprocess consumers.☆39Updated last year
- ☆23Updated 7 years ago
- Open source analytics platform powered by Apache Cassandra, Spark, and Kafka☆34Updated 9 years ago
- Python Client for WebHDFS REST API☆43Updated 9 years ago
- Beyond Piwik Analytics with Scala and Apache Spark☆45Updated 10 years ago
- A curated list of awesome Apache Spark packages and resources.☆40Updated 7 years ago
- Additional opennlp mapping type for elasticsearch in order to perform named entity recognition☆136Updated 8 years ago
- A collection of datasets and databases☆24Updated 6 years ago
- A javascript shell for elasticsearch☆105Updated 9 years ago
- Slides to learn a little natural language processing (NLP) with Python. Written in reST with S5/Docutils.☆28Updated 12 years ago
- Find which links on a web page are pagination links☆29Updated 8 years ago
- An Apache Spark-shell backend for IPython☆105Updated 3 years ago
- Natural Language Processing with Spark's MLlib☆62Updated 7 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin☆52Updated 8 years ago
- ☆146Updated 8 years ago
- Luigi Plugin for Hubot☆35Updated 8 years ago
- Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics.☆110Updated 2 years ago
- Utilities and examples to assist in working with PySpark and Cassandra.☆36Updated 9 years ago
- A custom SimilarityProvider example for Elasticsearch☆36Updated 9 years ago