dushyantkhosla / airflow4ds
Using Apache Airflow to author, run and monitor complex data pipelines.
☆12 Updated 6 years ago
Alternatives and similar repositories for airflow4ds:
Users who are interested in airflow4ds are comparing it to the repositories listed below.
- Repository used for Spark Trainings ☆53 Updated last year
- Airflow training for the crunch conf ☆105 Updated 6 years ago
- Workshop for Spark and Databricks ☆54 Updated 5 years ago
- Create HTML profiling reports from Apache Spark DataFrames ☆195 Updated 5 years ago
- PySpark Code for Hands-on Learners ☆116 Updated 5 years ago
- MLFlow Spark Summit 2019 Presentation ☆67 Updated 5 years ago
- ☆33 Updated last year
- (project & tutorial) dag pipeline tests + ci/cd setup ☆86 Updated 4 years ago
- ☆84 Updated 2 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆90 Updated 3 years ago
- A four-day course on Python, the Scientific Python stack and PySpark, adapted from a training course I gave to one of our clients in Dece… ☆10 Updated 9 years ago
- Build data models, data warehouses, and data lakes; automate data pipelines; and work with massive datasets. ☆13 Updated 5 years ago
- A code-based tutorial for production level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, & Apache Drill u… ☆26 Updated 5 years ago
- Updated repository ☆157 Updated 3 years ago
- Here's how to get DataQuest's Data Engineering Track missions' content to work on your localhost. Using data from my Valenbisi ARIMA mode… ☆15 Updated 6 years ago
- ☆198 Updated last year
- A repository for a PySpark Cookbook by Tomasz Drabas and Denny Lee ☆60 Updated 6 years ago
- A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d… ☆24 Updated 3 years ago
- A simple Spark TDD example ☆26 Updated 7 years ago
- Code to build a simple analytics data pipeline with Python ☆102 Updated 8 years ago
- ☆16 Updated 7 years ago
- code, labs and lectures for the course ☆46 Updated last year
- My presentation at ODSC India 2018 about Deep Learning with Apache Spark ☆27 Updated 6 years ago
- MLflow samples - deprecated ☆22 Updated last year
- Various data stream/batch process demo with Apache Scala Spark 🚀 ☆11 Updated 5 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated last year
- Learn the pyspark API through pictures and simple examples ☆170 Updated 4 years ago
- ☆25 Updated 6 years ago
- Use Airflow to move data from multiple MySQL databases to BigQuery ☆100 Updated 4 years ago
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames. ☆103 Updated 5 years ago