developershomes / SparkETL

Spark all the ETL Pipelines

☆36

Alternatives and similar repositories for SparkETL

Users who are interested in SparkETL are comparing it to the libraries listed below.

josephmachado / beginner_de_project_stream
Simple stream processing pipeline
☆110 Updated last year
josephmachado / data_engineering_best_practices
Sample project to demonstrate data engineering best practices
☆200 Updated last year
bartosz25 / data-engineering-design-patterns-book
Code snippets for Data Engineering Design Patterns book
☆275 Updated 8 months ago
josephmachado / efficient_data_processing_spark
Code for "Efficient Data Processing in Spark" Course
☆347 Updated last month
dominikhei / Local-Data-LakeHouse
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…
☆75 Updated 2 years ago
dipankarmazumdar / awesome-lakehouse-guide
Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture
☆124 Updated 2 weeks ago
josephmachado / online_store
End to end data engineering project
☆57 Updated 3 years ago
cordon-thiago / airflow-spark
Docker with Airflow and Spark standalone cluster
☆262 Updated 2 years ago
josephmachado / data_engineering_project_template
A template repository to create a data project with IAC, CI/CD, Data migrations, & testing
☆281 Updated last year
TJaniF / airflow-elt-blueprint
A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces.
☆79 Updated 2 years ago
Armaan1Gohil / dataengineering-tech-stack
Local Environment to Practice Data Engineering
☆143 Updated 10 months ago
josephmachado / adv_data_transformation_in_sql
Code for "Advanced data transformations in SQL" free live workshop
☆88 Updated 6 months ago
Amrit-Hub / Databricks-Certified-Data-Engineer-Professional-Questions
This repo contains "Databricks Certified Data Engineer Professional" Questions and related docs.
☆120 Updated last year
JesusAcuna / data-engineering-project
☆29 Updated 2 years ago
josephmachado / simple_dbt_project
Code for dbt tutorial
☆165 Updated 2 months ago
afaqueahmad7117 / spark-experiments
Ultimate guide for mastering Spark Performance Tuning and Optimization concepts and for preparing for Data Engineering interviews
☆177 Updated 2 months ago
subhamkharwal / ease-with-apache-spark
A series on learning Apache Spark (PySpark), with quick tips and workarounds for everyday problems
☆56 Updated 2 years ago
hnawaz007 / dbt-dw
Build a DW with dbt
☆49 Updated last year
Data-Engineer-Camp / dbt-dimensional-modelling
Step-by-step tutorial on building a Kimball dimensional model with dbt
☆154 Updated last year
abdkumar / spotify-stream-analytics
Generate synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consu…
☆69 Updated last year
manuel-lang / Data-Engineering-Nanodegree
Solution to all projects of Udacity's Data Engineering Nanodegree: Data Modeling with Postgres & Cassandra, Data Warehouse with Redshift,…
☆57 Updated 3 years ago
cartershanklin / pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
☆480 Updated last year
martandsingh / ApacheSpark
This repository will help you to learn about databricks concept with the help of examples. It will include all the important topics which…
☆103 Updated 2 months ago
josephmachado / bitcoinMonitor
Near real time ETL to populate a dashboard.
☆73 Updated 2 months ago
alanchn31 / Movalytics-Data-Warehouse
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
☆158 Updated 5 years ago
MarcosMJD / ghcn-d
Data Pipeline from the Global Historical Climatology Network DataSet
☆27 Updated 2 years ago
Snowflake-Labs / sfguide-data-engineering-with-snowpark-python
☆141 Updated 9 months ago
supratim94336 / DataEngineeringCapstoneProject
😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS
☆50 Updated 6 years ago
tobyweb3x / data-engineering-book-reviews
Just starting your DE journey, or already along the way? I will be sharing a short list of DATA-ENGINEERING-CENTRED books that covers the…
☆34 Updated 3 years ago
josephmachado / python_essentials_for_data_engineers
Code for blog at https://www.startdataengineering.com/post/python-for-de/
☆91 Updated last year