godatadriven / data-pipelines-with-airflow-2nd-ed
Code for the second edition of the book Data Pipelines with Apache Airflow
☆12 Updated 2 weeks ago
Alternatives and similar repositories for data-pipelines-with-airflow-2nd-ed
Users who are interested in data-pipelines-with-airflow-2nd-ed are comparing it to the repositories listed below
- Skeleton project for Apache Airflow training participants to work on. ☆16 Updated 4 years ago
- Full stack data engineering tools and infrastructure set-up ☆52 Updated 4 years ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆77 Updated 2 years ago
- Apache Airflow Best Practices, published by Packt ☆41 Updated 6 months ago
- The Python fake data producer for Apache Kafka® is a complete demo app allowing you to quickly produce JSON fake streaming datasets and … ☆85 Updated last year
- ☆49 Updated 3 years ago
- Code snippets for Data Engineering Design Patterns book ☆106 Updated last month
- (Project & tutorial) DAG pipeline tests + CI/CD setup ☆87 Updated 4 years ago
- ☆20 Updated 5 years ago
- Evaluation Matrix for Change Data Capture ☆25 Updated 9 months ago
- Public source code for the Batch Processing with Apache Beam (Python) online course ☆18 Updated 4 years ago
- A Series of Notebooks on how to start with Kafka and Python ☆154 Updated 2 months ago
- Example repo for creating end-to-end tests for a data pipeline. ☆24 Updated 11 months ago
- Project for "Data pipeline design patterns" blog. ☆45 Updated 9 months ago
- ☆17 Updated 6 months ago
- Simple stream processing pipeline ☆102 Updated 10 months ago
- ☆36 Updated 2 years ago
- ☆87 Updated 2 years ago
- ☆84 Updated 2 years ago
- Sample Airflow DAGs to load data from the CovidTracking API to Snowflake via an AWS S3 intermediary. ☆16 Updated 4 years ago
- Public source code for the Udemy online course Apache Airflow: Complete Hands-On Beginner to Advanced Class. ☆63 Updated 4 years ago
- A series of Jupyter notebooks that walk you through Machine Learning with the Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo… ☆82 Updated last year
- Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validatio… ☆55 Updated 2 years ago
- Weekly Data Engineering Newsletter ☆94 Updated 10 months ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 Updated 2 years ago
- A course by DataTalks Club that covers Spark, Kafka, Docker, Airflow, Terraform, dbt, BigQuery, etc. ☆13 Updated 3 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆56 Updated 9 months ago
- Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach. ☆57 Updated 2 years ago
- Data engineering with dbt, published by Packt ☆77 Updated last year
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 2 years ago