jaehyeon-kim / kafka-pocs
Apache Kafka and Related Projects
☆23 Updated 6 months ago
Related projects:
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆43 Updated last year
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆65 Updated 2 years ago
- Materials for the next course ☆22 Updated last year
- Access Amazon MSK from Amazon EKS using Terraform and Helm. ☆25 Updated 3 years ago
- Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects ☆38 Updated 3 months ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 Updated 6 years ago
- Spark on Kubernetes samples ☆20 Updated 3 years ago
- CICD pipeline that deploys a dbt image on a GKE cluster ☆11 Updated 3 years ago
- The Python fake data producer for Apache Kafka® is a complete demo app allowing you to quickly produce JSON fake streaming datasets and … ☆81 Updated 4 months ago
- Operational Data Processing Framework developed using AWS Glue and Apache Hudi. This framework is suitable for Data Lake and Modern Data … ☆21 Updated last year
- Stream Processing Workshop ☆20 Updated last month
- Terraform module to create AWS EMR resources 🇺🇦 ☆23 Updated last month
- ☆34 Updated last year
- Data Pipeline for CDC data from MySQL DB to Amazon OpenSearch Service through Amazon Kinesis using Amazon Data Migration Service (DMS). ☆25 Updated last month
- Demo for GitHub Universe 2022 ☆12 Updated last year
- AWS Quick Start Team ☆18 Updated 10 months ago
- ☆17 Updated 2 weeks ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ☆22 Updated last year
- Spark ETL example processing New York taxi rides public dataset on EKS ☆42 Updated last year
- Docker environment to stream data from Kafka to Iceberg tables ☆24 Updated 6 months ago
- Intended for internal use: deploys all infrastructure required for Astronomer to run on GCP ☆10 Updated last month
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres+ Debezium CDC, MySQL,… ☆25 Updated 5 months ago
- Amazon EMR Serverless and Amazon MSK Serverless Demo ☆13 Updated 2 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 Updated 2 years ago
- Build, Test and Deploy ETL solutions using AWS Glue and AWS CDK based CI/CD pipelines ☆36 Updated last year
- dbt / Amazon Redshift Demonstration Project ☆30 Updated last year
- Build DataOps platform with Apache Airflow and dbt on AWS ☆51 Updated 3 years ago
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆166 Updated 2 years ago
- Sample Airflow DAGs ☆60 Updated last year
- ☆3 Updated last year
- ☆11 Updated this week