dushyantkhosla / airflow4ds
Using Apache Airflow to author, run and monitor complex data pipelines.
☆12 · Updated 5 years ago
Related projects:
- Airflow training for the crunch conf ☆105 · Updated 5 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆89 · Updated 2 years ago
- Repository used for Spark Trainings ☆53 · Updated last year
- (project & tutorial) dag pipeline tests + ci/cd setup ☆84 · Updated 3 years ago
- MLFlow Spark Summit 2019 Presentation ☆67 · Updated 5 years ago
- python automatic data quality check toolkit ☆283 · Updated 4 years ago
- PySpark phonetic and string matching algorithms ☆35 · Updated 7 months ago
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames. ☆100 · Updated 5 years ago
- Code to build a simple analytics data pipeline with Python ☆102 · Updated 7 years ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆167 · Updated 10 months ago
- Create HTML profiling reports from Apache Spark DataFrames ☆195 · Updated 4 years ago
- A short workshop on datascience pipelines using mlflow and airflow ☆53 · Updated last year
- 🐍💨 Airflow tutorial for PyCon 2019 ☆85 · Updated last year
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆35 · Updated 2 months ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆104 · Updated last week
- Use Airflow to move data from multiple MySQL databases to BigQuery ☆99 · Updated 4 years ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆37 · Updated 5 years ago
- Learn the pyspark API through pictures and simple examples ☆168 · Updated 3 years ago
- Code for my blogs on Data Engineering ☆13 · Updated 3 years ago
- Workshop for Spark and Databricks ☆54 · Updated 4 years ago
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines. ☆47 · Updated last year
- Tutorial-like code showing how to deploy Airflow using Docker and how to use the DockerOperator. ☆44 · Updated 4 years ago
- This repository contains the material for my tutorial "Managing the end-to-end machine learning lifecycle with MLFlow" at pyData/pyCon Be… ☆39 · Updated last year