daddydrac / PySpark-Confluent-Kafka-Apache-Drill-

A code-based tutorial for production-level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, and Apache Drill, using Docker and Cassandra (NoSQL DB) for storage. This allows for fast feature engineering and data cleaning.
26 · Updated 5 years ago

Alternatives and similar repositories for PySpark-Confluent-Kafka-Apache-Drill-:

Users who are interested in PySpark-Confluent-Kafka-Apache-Drill- are comparing it to the libraries listed below.