Parsely / pyspark-cassandra
Utilities and examples to assist in working with PySpark and Cassandra.
☆36 Updated 10 years ago
Alternatives and similar repositories for pyspark-cassandra:
Users who are interested in pyspark-cassandra are comparing it to the libraries listed below.
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark ☆51 Updated 11 years ago
- Code to allow running BIDMach on Spark including HDFS integration and lightweight sparse model updates (Kylix). ☆15 Updated 4 years ago
- Open source analytics platform powered by Apache Cassandra, Spark, and Kafka ☆34 Updated 9 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 Updated 9 years ago
- Automates Spark standalone cluster tasks with Puppet and Fabric. ☆43 Updated 10 years ago
- Luigi Plugin for Hubot ☆35 Updated 8 years ago
- This project contains the code to translate between Apache Spark and SFrame. ☆20 Updated 8 years ago
- Docker image for apache zeppelin ☆38 Updated 7 years ago
- Cascading on Apache Flink® ☆54 Updated last year
- A DC/OS time series demo ☆62 Updated 9 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 Updated 8 years ago
- GPU Acceleration for Apache Spark ☆34 Updated 9 years ago
- Tail a log file and send log lines automatically to a kafka topic ☆57 Updated 12 years ago
- Deploy Dask on Marathon ☆10 Updated 8 years ago
- On demand presto cluster with mesos, marathon and docker. ☆30 Updated 7 years ago
- People. Places. Things. Graphs. ☆92 Updated 10 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 7 years ago
- A Cascading Workflow Visualizer ☆83 Updated last year
- Periscope brings SLA policy based autoscaling to Hadoop ☆35 Updated 9 years ago
- A compiler for Pig Latin to Spark and Flink. ☆23 Updated 5 years ago
- Machine Learning for Cascading ☆82 Updated 9 years ago
- just put my data in a database! ☆39 Updated 9 years ago
- Scriptable scheduler for periodical Hadoop workflows ☆22 Updated 7 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 Updated 7 years ago
- A distributed in-memory fabric based on shared-memory blocks and datashape. Any language can operate on the data. ☆13 Updated 9 years ago
- A platform for real-time streaming search ☆103 Updated 9 years ago
- Data Science Research Architecture, Data Center OS ☆21 Updated 8 years ago
- Real time and offline time series analysis with Spark, Spark Streaming and Storm ☆21 Updated 4 years ago