atlanhq / airflow_blog
Code that goes along with https://humansofdata.atlan.com/2018/06/apache-airflow-disease-outbreaks-india/
☆24 · Updated last year
Alternatives and similar repositories for airflow_blog:
Users who are interested in airflow_blog are comparing it to the repositories listed below.
- Using Luigi to create a Machine Learning Pipeline using the Rossman Sales data from Kaggle ☆33 · Updated 8 years ago
- Big Data Demystified meetup and blog examples ☆31 · Updated 5 months ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 · Updated 7 years ago
- ☆16 · Updated 7 years ago
- Python library for efficient multi-threaded data processing, with the support for out-of-memory datasets. ☆27 · Updated 5 years ago
- Techniques for Scraping the Web in Python ☆25 · Updated 6 years ago
- How to do data science with Optimus, Spark and Python. ☆19 · Updated 5 years ago
- Basic tutorial of using Apache Airflow ☆36 · Updated 6 years ago
- Material for Talk Python Training course on Getting Started with Dask. ☆28 · Updated 2 years ago
- Building an API with the FastAPI framework to serve a scikit-learn model. ☆18 · Updated 6 years ago
- Just a boilerplate for PySpark and Flask ☆35 · Updated 6 years ago
- Business Data Analysis by HiPIC of CalStateLA ☆20 · Updated 6 years ago
- Project template for highly effective data science workflows ☆29 · Updated 9 months ago
- Cloudformation template for deploying Presto on AWS ☆13 · Updated 4 years ago
- Work for Mastering Large Datasets with Python ☆18 · Updated 2 years ago
- ☆54 · Updated 6 years ago
- Jupyter notebooks for learning Python and Data Science, companion to Data Science Solutions book. ☆36 · Updated 4 years ago
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines. ☆46 · Updated last year
- JupyterCon Missing Data Talk 2018 ☆23 · Updated 6 years ago
- This repo demonstrates how to load a sample Parquet formatted file from an AWS S3 Bucket. A python job will then be submitted to a Apach… ☆19 · Updated 8 years ago
- A Scalable Data Cleaning Library for PySpark. ☆26 · Updated 5 years ago
- ☆25 · Updated 6 years ago
- pyspark sample scripts ☆17 · Updated 6 years ago
- Udacity Data Pipeline Exercises ☆15 · Updated 4 years ago
- Simple alert system implemented in Kafka and Python ☆95 · Updated 6 years ago
- A code-based tutorial for production level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, & Apache Drill u… ☆26 · Updated 5 years ago