paypay / DataEngineerChallenge
☆23Updated 3 years ago
Alternatives and similar repositories for DataEngineerChallenge
Users that are interested in DataEngineerChallenge are comparing it to the libraries listed below
- (project & tutorial) dag pipeline tests + ci/cd setup☆90Updated 4 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes☆63Updated 3 years ago
- Magic to help Spark pipelines upgrade☆34Updated last year
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,…☆89Updated 4 years ago
- A Python PySpark Project with Poetry☆24Updated 6 months ago
- A repository of sample code to accompany our blog post on Airflow and dbt.☆183Updated 2 years ago
- Repository used for Spark Trainings☆54Updated 2 years ago
- Data validation library for PySpark 3.0.0☆33Updated 3 years ago
- Spark and Delta Lake Workshop☆22Updated 3 years ago
- Weekly Data Engineering Newsletter☆96Updated last year
- Snowflake Guide: Building a Recommendation Engine Using Snowflake & Amazon SageMaker☆32Updated 4 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…☆56Updated 2 years ago
- Airflow training for the crunch conf☆105Updated 7 years ago
- Code snippets used in demos recorded for the blog.☆37Updated 3 weeks ago
- Filling in the Spark function gaps across APIs☆50Updated 4 years ago
- An Introduction to Scala☆23Updated 2 years ago
- Various data stream/batch process demo with Apache Scala Spark 🚀☆11Updated 5 years ago
- Step-by-step tutorial on building a Kimball dimensional model with dbt☆163Updated last year
- PySpark data-pipeline testing and CICD☆28Updated 5 years ago
- A simple Spark-powered ETL framework that just works 🍺☆183Updated 4 months ago
- Airflow Unit Tests and Integration Tests☆261Updated 3 years ago
- BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.☆420Updated this week
- Spark style guide☆272Updated last year
- Repository of sample Databricks notebooks☆277Updated last year
- Scaffold of Apache Airflow executing Docker containers☆85Updated 3 years ago
- The official repository for the Rock the JVM Spark Optimization with Scala course☆58Updated 2 years ago
- The official repository for the Rock the JVM Spark Optimization 2 course☆42Updated 2 years ago
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work☆47Updated 3 years ago
- Building Big Data Pipelines with Apache Beam, published by Packt☆89Updated 2 years ago
- Batch processing, orchestration using Apache Airflow and Google Workflows, Spark Structured Streaming, and a lot more☆18Updated 3 years ago