chuqbach / Big-Data-Installation
The Complete Big Data Installation Solutions
☆15 Updated last year
Alternatives and similar repositories for Big-Data-Installation:
Users who are interested in Big-Data-Installation are comparing it to the repositories listed below:
- Simple stream processing pipeline ☆99 Updated 9 months ago
- ☆14 Updated 2 years ago
- Open source stack lakehouse ☆25 Updated last year
- Code snippets for Data Engineering Design Patterns book ☆75 Updated 2 weeks ago
- ☆45 Updated 4 years ago
- Data Engineering Handbook for beginners and everyone ☆49 Updated 8 months ago
- This repo is mostly created for PySpark and Hive related interview questions. ☆47 Updated 3 years ago
- ☆87 Updated 2 years ago
- ☆15 Updated 2 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC) ☆53 Updated last year
- Nyc_Taxi_Data_Pipeline - DE Project ☆103 Updated 5 months ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin… ☆63 Updated last year
- A Python PySpark Project with Poetry ☆23 Updated 6 months ago
- My Setup Development Environment as Data Engineer ☆24 Updated 2 weeks ago
- PySpark Cheatsheet ☆36 Updated 2 years ago
- Built a Data Pipeline for a Retail store using AWS services that collects data from its transactional database (OLTP) in Snowflake and tr… ☆10 Updated last year
- Code for dbt tutorial ☆155 Updated 10 months ago
- Local Environment to Practice Data Engineering ☆144 Updated 3 months ago
- This repo contains a Spark standalone cluster on Docker for anyone who wants to play with PySpark by submitting their applications. ☆32 Updated last year
- End to end data engineering project ☆53 Updated 2 years ago
- 😈 Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆45 Updated 5 years ago
- ☆261 Updated 5 months ago
- A course by DataTalks Club that covers Spark, Kafka, Docker, Airflow, Terraform, DBT, BigQuery, etc. ☆14 Updated 3 years ago
- End-to-end data platform: A PoC Data Platform project utilizing modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive metastore,… ☆34 Updated 5 months ago
- Ravi Azure ADB ADF Repository ☆65 Updated 2 months ago
- Spark, Airflow, Kafka ☆26 Updated last year
- ☆11 Updated 4 years ago
- Classwork projects and homework done through the Udacity Data Engineering Nanodegree ☆74 Updated last year
- ☆36 Updated 2 years ago
- A custom end-to-end analytics platform for customer churn ☆11 Updated 2 months ago