zekeriyyaa / Apache-Spark-Structured-Streaming-Via-Docker-Compose
☆12 Updated last year
Alternatives and similar repositories for Apache-Spark-Structured-Streaming-Via-Docker-Compose:
Users who are interested in Apache-Spark-Structured-Streaming-Via-Docker-Compose are comparing it to the repositories listed below
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 2 years ago
- ☆26 Updated 4 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆39 Updated 3 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- PySpark Cheatsheet ☆36 Updated 2 years ago
- Airflow Helm chart for AWS EKS ☆18 Updated 4 years ago
- ☆87 Updated 2 years ago
- Sample Airflow DAGs ☆62 Updated 2 years ago
- Data Engineering with Spark and Delta Lake ☆94 Updated 2 years ago
- Udacity Data Engineer Nano Degree - Project-3 (Data Warehouse) ☆22 Updated 5 years ago
- Spark app to merge different schemas ☆23 Updated 4 years ago
- ☆38 Updated 2 years ago
- PySpark-ETL ☆23 Updated 5 years ago
- Simplifying Data Engineering and Analytics with Delta, published by Packt ☆21 Updated last year
- code snippet for analytics sessions ☆33 Updated 2 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up ☆48 Updated 3 years ago
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution ☆19 Updated 4 years ago
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆43 Updated 2 years ago
- Book / Blog of different topics around MLOps engineering. ☆9 Updated 11 months ago
- ☆14 Updated 5 years ago
- Delta Lake Documentation ☆48 Updated 7 months ago
- Simplify Big Data Analytics with Amazon EMR, published by Packt ☆13 Updated 2 years ago
- Spark and Delta Lake Workshop ☆22 Updated 2 years ago
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆173 Updated 3 years ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary. ☆28 Updated last year
- EverythingApacheNiFi ☆106 Updated last year
- ☆20 Updated 5 years ago
- Snowflake Cookbook, published by Packt ☆76 Updated 2 years ago
- A PySpark job to handle upserts, conversion to Parquet, and partition creation on S3 ☆26 Updated 4 years ago