skoonData / docker-compose
☆11 Updated 3 years ago
Alternatives and similar repositories for docker-compose:
Users who are interested in docker-compose are comparing it to the repositories listed below
- ☆26 Updated 7 months ago
- ☆21 Updated 2 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆38 Updated 4 years ago
- ☆47 Updated last year
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆45 Updated last year
- ☆86 Updated 2 months ago
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow ☆33 Updated 4 years ago
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 2 years ago
- ☆14 Updated 2 years ago
- A shell script to automate the operations of sqoop ☆11 Updated 4 years ago
- An End-to-End ETL data pipeline that leverages pyspark parallel processing to process about 25 million rows of data coming from a SaaS ap… ☆25 Updated 2 years ago
- Spark implementation of Slowly Changing Dimension type 2 ☆11 Updated 6 years ago
- Delta-Lake, ETL, Spark, Airflow ☆47 Updated 2 years ago
- For Udemy students: the official repository of Rock the JVM's Spark Streaming course ☆26 Updated 2 years ago
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks ☆23 Updated 3 years ago
- A series that follows learning Apache Spark (PySpark), with quick tips and workarounds for everyday problems ☆49 Updated last year
- Writes a CSV file to Postgres, then reads the table and modifies it. Writes more tables to Postgres with Airflow. ☆35 Updated last year
- ☆24 Updated 3 years ago
- The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Pos… ☆63 Updated 2 years ago
- Create streaming data, transfer it to Kafka, modify it with PySpark, and send it to ElasticSearch and MinIO ☆60 Updated last year
- Apache Spark Structured Streaming with Kafka using Python (PySpark) ☆40 Updated 5 years ago
- Apache Spark for data engineers ☆56 Updated 2 years ago
- ☆21 Updated last month
- This is a GitHub for all of my NiFi Templates ☆47 Updated 4 years ago
- Companion repository for the book 'Delta Lake Up and Running' ☆46 Updated 3 weeks ago
- Delta Lake examples ☆224 Updated 6 months ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB. ☆41 Updated last year
- The source code for the book Modern Data Engineering with Apache Spark ☆36 Updated 2 years ago
- Data engineering with dbt, published by Packt ☆77 Updated last year
- Base Docker image with just essentials: Hadoop, Hive and Spark. ☆68 Updated 4 years ago