garystafford / emr-msk-serverless-demo
Amazon EMR Serverless and Amazon MSK Serverless Demo
☆13Updated 2 years ago
Alternatives and similar repositories for emr-msk-serverless-demo:
Users who are interested in emr-msk-serverless-demo are comparing it to the repositories listed below
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats☆28Updated last year
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python☆44Updated 2 years ago
- Sample code to collect Apache Iceberg metrics for table monitoring☆23Updated 5 months ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A…☆41Updated 2 years ago
- ☆34Updated 2 years ago
- dbt / Amazon Redshift Demonstration Project☆33Updated 2 years ago
- ☆12Updated 2 years ago
- CICD pipeline that deploys a dbt image on a GKE cluster☆11Updated 3 years ago
- Utility functions for dbt projects running on Spark☆31Updated last year
- Materials for the next course☆24Updated last year
- ☆29Updated 10 months ago
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the Emr Serverless job, you…☆11Updated this week
- Glue VSCode devcontainer setup☆14Updated last year
- Spark data pipeline that processes movie ratings data.☆27Updated this week
- A repository of sample code to show data quality checking best practices using Airflow.☆74Updated last year
- All the Snowflake Virtual Warehouse - Example☆11Updated 4 years ago
- Cloned by the `dbt init` task☆60Updated 8 months ago
- Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3☆20Updated 4 months ago
- This is a collection of Amazon CDK projects to show how to directly ingest streaming data from Amazon Managed Service for Apache Kafka (M…☆11Updated 4 months ago
- Covid19 and Iowa Liquor Sales analysis at BigQuery using dbt, Airflow, Marquez, Google Cloud and other modern data stack tools☆14Updated 2 years ago
- Example code for running Spark and Hive jobs on EMR Serverless.☆157Updated last week
- Delta Lake Documentation☆48Updated 7 months ago
- dbt sample project for Snowflake using the `TPCH` dataset that ships as a shared database with Snowflake.☆21Updated 2 years ago
- 🐋 Docker image for AWS Glue Spark/Python☆23Updated last year
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics☆65Updated last year
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆28Updated this week
- Sample Airflow DAGs☆61Updated 2 years ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR☆65Updated 3 years ago
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work☆48Updated 2 years ago