skoonData / docker-compose
☆12Updated 4 years ago
Alternatives and similar repositories for docker-compose
Users who are interested in docker-compose are comparing it to the repositories listed below
- ☆25Updated last year
- ☆14Updated 2 years ago
- Open episode of the data engineering practice course☆29Updated last year
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data☆49Updated last year
- Multi-container environment with Hadoop, Spark and Hive☆224Updated 6 months ago
- ☆21Updated 2 years ago
- ☆16Updated 8 months ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker.☆495Updated this week
- Docker with Airflow and Spark standalone cluster☆261Updated 2 years ago
- Surfalytics projects on Data Engineering and Analytics☆114Updated 2 weeks ago
- dbt module for myBI connect☆13Updated 2 years ago
- Tutorial for setting up a Spark cluster running inside of Docker containers located on different machines☆134Updated 3 years ago
- Apache Spark 3 - Structured Streaming Course Material☆124Updated 2 years ago
- ☆47Updated 2 years ago
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow☆34Updated 5 years ago
- ☆91Updated 9 months ago
- All demo code for the Udemy course "Programming in Snowflake".☆25Updated last year
- Spark all the ETL Pipelines☆35Updated 2 years ago
- An End-to-End ETL data pipeline that leverages pyspark parallel processing to process about 25 million rows of data coming from a SaaS ap…☆25Updated 2 years ago
- End-to-end data platform: A PoC Data Platform project utilizing modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive metastore,…☆46Updated last year
- Apache Spark for data engineers☆56Updated 3 years ago
- For Udemy students: the official repository of Rock the JVM's Spark Streaming course☆26Updated 2 years ago
- Simple stream processing pipeline☆110Updated last year
- This repo contains a spark standalone cluster on docker for anyone who wants to play with PySpark by submitting their applications.☆37Updated 2 years ago
- Companion repository for the book 'Delta Lake Up and Running'☆47Updated 7 months ago
- The source code for the book Modern Data Engineering with Apache Spark☆38Updated 3 years ago
- Local Environment to Practice Data Engineering☆141Updated 10 months ago
- The simple ETL with docker container☆61Updated 5 months ago
- ☆12Updated 4 years ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…☆74Updated 2 years ago