charlesb / CDF-workshop
Leveraging Hortonworks' HDP 3.1.0 and HDF 3.4.0 components, this tutorial guides the user through streaming data from a REST API into a live dashboard using NiFi, Kafka, Hive LLAP with Druid integration, and Superset. The workshop also covers remotely managing MiNiFi agents with Edge Flow Manager (EFM) so they can send data to NiFi.
☆19 · Updated 6 years ago
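The ingestion step the workshop describes (poll a REST API, then publish each record to Kafka for downstream consumers) can be sketched in Python. The `fetch`/`send` callables, the sample payload, and the `sensor-readings` topic below are hypothetical stand-ins, not part of the workshop's code; the in-memory fakes let the sketch run without NiFi or a live Kafka broker, where in practice `send` would be a Kafka producer's `send()`.

```python
import json
from typing import Callable, List, Tuple

def poll_and_forward(fetch: Callable[[], str],
                     send: Callable[[str, bytes], None],
                     topic: str) -> int:
    """Fetch one batch of JSON records from a REST source and forward
    each record, serialized as UTF-8 JSON, to a messaging sink."""
    records = json.loads(fetch())
    for record in records:
        send(topic, json.dumps(record).encode("utf-8"))
    return len(records)

# In-memory stand-ins (hypothetical), so the sketch runs offline.
sent: List[Tuple[str, bytes]] = []
fake_fetch = lambda: json.dumps([{"id": 1, "temp": 21.5},
                                 {"id": 2, "temp": 19.8}])
fake_send = lambda topic, value: sent.append((topic, value))

count = poll_and_forward(fake_fetch, fake_send, "sensor-readings")
```

In the tutorial itself this poll-publish loop is handled visually by a NiFi flow (e.g. an HTTP-ingest processor feeding a Kafka-publish processor) rather than hand-written code.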
Alternatives and similar repositories for CDF-workshop
Users interested in CDF-workshop are comparing it to the repositories listed below.
- ☆27 · Updated last year
- Spark and Delta Lake Workshop ☆22 · Updated 3 years ago
- Apache Spark Connector for SQL Server and Azure SQL ☆287 · Updated 10 months ago
- An Azure Function which allows Azure Data Factory (ADF) to connect to Snowflake in a flexible way. ☆26 · Updated 2 years ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, … ☆78 · Updated this week
- TPCDS benchmark for various engines ☆18 · Updated 3 years ago
- DataQuality for BigData ☆145 · Updated 2 years ago
- Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs ☆238 · Updated 10 months ago
- Delta Lake Documentation ☆51 · Updated last year
- dbt adapter for Azure Synapse Dedicated SQL Pools ☆76 · Updated 4 months ago
- A simple Spark-powered ETL framework that just works 🍺 ☆181 · Updated 3 months ago
- Edge2AI Workshop ☆70 · Updated 6 months ago
- Examples for High Performance Spark ☆16 · Updated 2 months ago
- An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset ☆110 · Updated 2 years ago
- Example code for doing DataOps ☆49 · Updated 4 years ago
- Databricks Platform - Architecture, Security, Automation and much more! ☆51 · Updated this week
- Delta Lake examples ☆235 · Updated last year
- Demo of using Nutter for testing Databricks notebooks in a CI/CD pipeline ☆152 · Updated last year
- Testing framework for Databricks notebooks ☆312 · Updated last year
- Data validation library for PySpark 3.0.0 ☆33 · Updated 3 years ago
- Smart Automation Tool for building modern Data Lakes and Data Pipelines ☆123 · Updated 3 weeks ago
- Multi-stage, config-driven, SQL-based ETL framework using PySpark ☆26 · Updated 6 years ago
- Client library for Azure Databricks ☆84 · Updated 3 weeks ago
- dbt adapter for dbt serverless pools ☆13 · Updated 2 years ago
- The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆279 · Updated 2 months ago
- This project provides a client library that allows Azure SQL DB or SQL Server to act as an input source or output sink for Spark jobs. ☆76 · Updated 5 years ago
- ☆32 · Updated 6 years ago
- A simplified, lightweight ETL framework based on Apache Spark ☆586 · Updated last year
- ☆76 · Updated last year
- The Taxonomy for ETL Automation Metadata (TEAM) is a tool for design metadata management geared towards data warehouse automation. It is … ☆37 · Updated 10 months ago