garystafford / datahub-on-aws-demo
DataHub on AWS demonstration resources
☆10Updated 2 years ago
Alternatives and similar repositories for datahub-on-aws-demo:
Users who are interested in datahub-on-aws-demo are comparing it to the libraries listed below:
- This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS …☆19Updated 3 years ago
- Example Set up For DBT Cloud using Github Integrations☆11Updated 5 years ago
- ☆11Updated 5 months ago
- Glue VSCode devcontainer setup☆14Updated 2 years ago
- dbt / Amazon Redshift Demonstration Project☆34Updated 2 years ago
- 🐋 Docker image for AWS Glue Spark/Python☆23Updated last year
- Styles for dbt on the net☆10Updated 5 months ago
- AWS Quick Start Team☆18Updated 7 months ago
- Using the Parquet file format with Python☆15Updated last year
- Sample code to collect Apache Iceberg metrics for table monitoring☆26Updated 8 months ago
- Kafka Connect playground☆10Updated 5 years ago
- This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. These examples cover IoT and CDC scenario…☆23Updated 5 months ago
- Fully unit tested utility functions for data engineering. Python 3 only.☆16Updated 8 months ago
- Collection of utility scripts to extract code so it can be upgraded to Snowflake using the SnowConvert tool.☆14Updated 2 weeks ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs☆20Updated 3 years ago
- Code to solve an open dataset on predictive maintenance of sheet breaks at a paper mill.☆8Updated 4 years ago
- Utility functions for dbt projects running on Spark☆33Updated 2 months ago
- dbt package for monitoring airflow DAGs and tasks☆29Updated 2 months ago
- Operational Data Processing Framework developed using AWS Glue and Apache Hudi. This framework is suitable for Data Lake and Modern Data …☆22Updated last year
- ☆10Updated 2 months ago
- ☆30Updated last year
- This is a real-life, high throughput streaming ELT data pipeline for ecommerce☆13Updated last year
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes.☆24Updated 6 years ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep…☆19Updated 4 years ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats☆29Updated 2 years ago
- Deluxe Airflow deployment with Minikube☆10Updated 4 years ago
- An infrastructure as code approach to deploying Snowflake using Terraform☆25Updated last year
- Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3☆23Updated 7 months ago
- Big Data Demystified meetup and blog examples☆31Updated 8 months ago
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the EMR Serverless job, you…☆11Updated this week