ultranet1 / APACHE_AIRFLOW_DATA_PIPELINES
Project Description: A music streaming company wants to introduce more automation and monitoring to its data warehouse ETL pipelines and has concluded that Apache Airflow is the best tool for the job. As their Data Engineer, I was tasked with creating a reusable production-grade data pipeline that incorporates data quality c…
☆17Updated 4 years ago
Alternatives and similar repositories for APACHE_AIRFLOW_DATA_PIPELINES
Users that are interested in APACHE_AIRFLOW_DATA_PIPELINES are comparing it to the libraries listed below
- Create Interactive Dashboards with Streamlit and Python Coursera☆10Updated 4 years ago
- This repository houses all the resources, contents, source codes, files, jupyter notebooks etc. related to Natural Language Processing re…☆10Updated 5 years ago
- Implementation of Spark code in Jupyter notebook. Topics include: RDDs and DataFrame, exploratory data analysis (EDA), handling multiple …☆30Updated 4 years ago
- ☆21Updated 2 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as …☆16Updated 5 years ago
- Use Multiple Linear Regression, Python, Pandas, and Matplotlib to analyze the lifetime value and the key factors of the ‘Telco Customer C…☆10Updated 5 years ago
- Fraud detection in credit card payments and auto insurance claims using PySpark☆13Updated 6 years ago
- Spark app to merge different schemas☆23Updated 4 years ago
- Money Laundering Detector is to prove the hypothesis that a solution powered by Machine Learning and Behaviour Analytics will find… -> cu…☆21Updated 7 years ago
- Looking for factors indicating fraud using insurance claims data.☆10Updated 6 years ago
- Automate PowerPoint Slides Creation with Python☆33Updated 4 months ago
- Amazon Redshift Cookbook, Published by Packt☆15Updated 2 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS)☆16Updated 6 years ago
- ☆16Updated 3 years ago
- Natural Language Processing project in tweets labeled by Myers Briggs Personality Type☆16Updated 6 years ago
- Insurance Claim Prediction using Machine Learning - Udacity Nanodegree Capstone Project☆16Updated 8 years ago
- Streamlit example showing Scikit Learn & Pyspark ML over Healthcare data! It's simple!!☆30Updated 4 years ago
- Data Analysis with Python - Customer Segmentation ( RFM Analysis) - Power BI Dashboard - Tableau Dashboard☆10Updated 4 years ago
- Learning and building API using Fast API☆16Updated 3 years ago
- Source code for 'Building a Data Warehouse' by Vincent Rainardi☆30Updated 8 years ago
- This repo implements a GUI for Chatting with your PDF files using PaLM embedding and LLM via API.☆26Updated last year
- Azure Data Engineering Cookbook 2nd-edition, published by Packt☆32Updated last year
- Code repo for Packt course I developed, "Beginning Data Wrangling with Python"☆30Updated 5 years ago
- Design/Implement stream/batch architecture on NYC taxi data | #DE☆25Updated 4 years ago
- This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and …☆31Updated last year
- All my Samples for 2021☆21Updated 3 years ago
- Blog post on ETL pipelines with Airflow☆23Updated 5 years ago
- Infuse AI into your application. Create and deploy a customer churn prediction model with IBM Cloud Private for Data, Db2 Warehouse, Spar…☆17Updated last month
- ETL process which downloads, transforms, and loads Freddie Mac/Fannie Mae mortgage data☆19Updated 7 years ago
- Exploring GPT-3☆31Updated 3 years ago