charlesb / CDF-workshop
Leveraging Hortonworks' HDP 3.1.0 and HDF 3.4.0 components, this tutorial guides the user through steps to stream data from a REST API into a live dashboard using NiFi, Kafka, Hive LLAP with Druid integration and Superset. This workshop will also cover steps to remotely manage MiNiFi to send data to NiFi using Edge Flow Manager (EFM).
☆20 · Updated 5 years ago
Alternatives and similar repositories for CDF-workshop:
Users who are interested in CDF-workshop are comparing it to the repositories listed below.
- ☆28 · Updated last year
- HDF masterclass materials ☆28 · Updated 9 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 · Updated last year
- A Spark datasource for the HadoopOffice library ☆38 · Updated 2 years ago
- MonitoFi: Health & Performance Monitor for your Apache NiFi ☆62 · Updated last year
- ☆32 · Updated 6 years ago
- Edge2AI Workshop ☆69 · Updated 2 months ago
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 · Updated 5 years ago
- Single view demo ☆14 · Updated 9 years ago
- CDF Tech Bootcamp ☆9 · Updated 5 years ago
- Code snippets used in demos recorded for the blog. ☆30 · Updated this week
- ☆27 · Updated 2 months ago
- Examples for High Performance Spark ☆15 · Updated 4 months ago
- A complete custom processor project, for your reference. ☆18 · Updated 9 years ago
- Kafka sink for Kusto ☆49 · Updated 2 weeks ago
- ☆16 · Updated 4 years ago
- Rocksdb state storage implementation for Structured Streaming. ☆17 · Updated 4 years ago
- Materials for various Hadoop & Nifi related workshops ☆19 · Updated 3 years ago
- Star Schema Benchmark using the Hive / Druid Integration ☆30 · Updated 7 years ago
- A bridge to Apache Atlas for provenance metadata created in course of using Apache NiFi ☆15 · Updated 2 years ago
- Postgresql configured to work as metastore for Hive. ☆32 · Updated 2 years ago
- Hadoop Data Pipeline using Falcon ☆15 · Updated 8 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 · Updated 3 years ago
- ☆25 · Updated 6 years ago
- TPCDS benchmark for various engines ☆18 · Updated 3 years ago
- The Internals of Spark on Kubernetes ☆70 · Updated 2 years ago
- MapReduce performance testing using teragen and terasort ☆18 · Updated 3 years ago
- ☆39 · Updated 6 years ago
- Ambari stack service for installing and managing Apache Airflow on HDP cluster ☆59 · Updated 6 years ago
- A small project to show how to add lineage to Atlas when using Spark as ETL tool ☆12 · Updated 8 years ago