Parsely / pyspark-cassandra
Utilities and examples to assist in working with PySpark and Cassandra.
☆36 Updated 10 years ago
Alternatives and similar repositories for pyspark-cassandra
Users who are interested in pyspark-cassandra are comparing it to the libraries listed below.
- Open source analytics platform powered by Apache Cassandra, Spark, and Kafka ☆34 Updated 10 years ago
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark ☆51 Updated 11 years ago
- A real time streaming implementation of markov chain based fraud detection ☆23 Updated 10 years ago
- This project contains the code to translate between Apache Spark and SFrame. ☆20 Updated 8 years ago
- Automates Spark standalone cluster tasks with Puppet and Fabric. ☆43 Updated 10 years ago
- Code to allow running BIDMach on Spark including HDFS integration and lightweight sparse model updates (Kylix). ☆15 Updated 4 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 8 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 Updated 9 years ago
- Scriptable scheduler for periodical Hadoop workflows ☆22 Updated 7 years ago
- Exelixi is a distributed framework for running genetic algorithms at scale. The framework is based on Apache Mesos and the code is mostly… ☆34 Updated 11 years ago
- An Apache Spark-shell backend for IPython ☆105 Updated 3 years ago
- On demand presto cluster with mesos, marathon and docker. ☆30 Updated 7 years ago
- Cascading on Apache Flink® ☆54 Updated last year
- Python bindings for TrailDB ☆39 Updated 5 years ago
- Apache Zeppelin on Kubernetes. ☆28 Updated 6 years ago
- ☆23 Updated 7 years ago
- machine learning playground ☆12 Updated 8 years ago
- ☆9 Updated 9 years ago
- Benchmarks of artificial neural network library for Spark MLlib ☆11 Updated 9 years ago
- High Level Kafka Scanner ☆19 Updated 7 years ago
- Reduce your data. A unix filter for algebird-powered aggregation. ☆138 Updated 8 years ago
- Real time and offline time series analysis with Spark, Spark Streaming and Storm ☆21 Updated 4 years ago
- Python Client for WebHDFS REST API ☆43 Updated 10 years ago
- Ferry lets you define, run, and deploy big data applications on AWS, OpenStack, and your local machine using Docker ☆253 Updated 10 years ago
- Tail a log file and send log lines automatically to a kafka topic ☆57 Updated 12 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 Updated 8 years ago
- Docker container to locally run Spark and Kafka ☆15 Updated 8 years ago
- A place for all things Pivotal & R ☆25 Updated 3 years ago
- An example project for doing grid search in MLlib ☆13 Updated 10 years ago
- A distributed in-memory fabric based on shared-memory blocks and datashape. Any language can operate on the data. ☆13 Updated 9 years ago