charlesb / CDF-workshop
Using Hortonworks HDP 3.1.0 and HDF 3.4.0 components, this tutorial guides the user through the steps to stream data from a REST API into a live dashboard using NiFi, Kafka, Hive LLAP with Druid integration, and Superset. The workshop also covers using Edge Flow Manager (EFM) to remotely manage MiNiFi agents that send data to NiFi.
☆20 · Updated 5 years ago
Alternatives and similar repositories for CDF-workshop:
Users who are interested in CDF-workshop are comparing it to the repositories listed below.
- Edge2AI Workshop ☆68 · Updated this week
- ☆27 · Updated 11 months ago
- ☆32 · Updated 5 years ago
- Code snippets used in demos recorded for the blog. ☆29 · Updated this week
- HDF masterclass materials ☆28 · Updated 8 years ago
- Examples for High Performance Spark ☆15 · Updated 2 months ago
- MonitoFi: Health & Performance Monitor for your Apache NiFi ☆62 · Updated last year
- A Spark datasource for the HadoopOffice library ☆39 · Updated 2 years ago
- A bridge to Apache Atlas for provenance metadata created in course of using Apache NiFi ☆15 · Updated 2 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 · Updated 4 years ago
- Spark and Delta Lake Workshop ☆22 · Updated 2 years ago
- Hadoop Data Pipeline using Falcon ☆15 · Updated 8 years ago
- A modern real-time streaming application serving as a reference framework for developing a big data pipeline, complete with a broad range… ☆41 · Updated 4 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 · Updated 11 months ago
- A general purpose framework for automating Cloudera Products ☆66 · Updated last month
- Materials for various Hadoop & Nifi related workshops ☆52 · Updated 5 years ago
- Multi-stage, config driven, SQL based ETL framework using PySpark ☆25 · Updated 5 years ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆74 · Updated this week
- The Internals of Spark on Kubernetes ☆70 · Updated 2 years ago
- An opinionated auto-deployer for the Hortonworks Platform ☆34 · Updated 3 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 · Updated 3 years ago
- ☆10 · Updated 2 years ago
- This repository contains NiFi processors for interacting with Snowflake Cloud Data Platform. ☆12 · Updated last month
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 · Updated 5 years ago
- PDF DataSource for Apache Spark ☆28 · Updated 3 weeks ago
- Data validation library for PySpark 3.0.0 ☆34 · Updated 2 years ago