r-kells / prod-airflow
Helping you get Airflow running in production.
☆9 Updated 5 years ago
Related projects
Alternatives and complementary repositories for prod-airflow
- Just a boilerplate for PySpark and Flask ☆35 Updated 6 years ago
- ☆26 Updated 10 months ago
- ☆18 Updated 3 years ago
- Udacity Data Pipeline Exercises ☆15 Updated 4 years ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 Updated 7 years ago
- Using Luigi to create a Machine Learning Pipeline using the Rossman Sales data from Kaggle ☆33 Updated 8 years ago
- 🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python ☆83 Updated 5 years ago
- A Scalable Data Cleaning Library for PySpark. ☆26 Updated 5 years ago
- Predict the poverty of households in Costa Rica using automated feature engineering. ☆23 Updated 4 years ago
- ElasticSearch implementation of MlFlow tracking store ☆16 Updated 4 years ago
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- code, labs and lectures for the course ☆45 Updated last year
- ☆19 Updated 3 years ago
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines. ☆46 Updated last year
- Data Science and Machine Learning with Python - Hands On from Udemy ☆14 Updated 7 years ago
- Data models, build data warehouses and data lakes, automate data pipelines, and worked with massive datasets. ☆13 Updated 5 years ago
- This is a machine learning challenge conducted by C&D Labs and Future Group in association with HackerEarth. ☆10 Updated 6 years ago
- Best practices for engineering ML pipelines. ☆37 Updated 2 years ago
- ☆10 Updated 6 years ago
- An example PySpark project with pytest ☆17 Updated 7 years ago
- Repo for building docker based airflow image. Containers support multiple features like writing logs to local or S3 folder and Initializi… ☆32 Updated 5 years ago
- Simple template showing how to set up docker for reproducible data science with Jupyter notebooks. ☆22 Updated 4 months ago
- Docker compose and Google Colab demo to build a CDC with Delta Lake ☆15 Updated 2 years ago
- REST API (and possible UI) for Machine Learning workflows ☆63 Updated 5 years ago
- scaffold of Apache Airflow executing Docker containers ☆85 Updated last year
- Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc ☆52 Updated 8 years ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on… ☆25 Updated 2 years ago
- This repo is an approach to TDD in machine learning model operation. it covers project structure, testing essentials using pytest with Gi… ☆14 Updated 3 years ago