polakowo / yelp-3nf
3NF-normalize Yelp data on S3 with Spark and load it into Redshift - automate the whole thing with Apache Airflow
☆12 Updated 5 years ago
Related projects
Alternatives and complementary repositories for yelp-3nf
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆36 Updated 4 months ago
- ☆16 Updated last year
- A Scalable Data Cleaning Library for PySpark. ☆26 Updated 5 years ago
- Snowflake Guide: Building a Recommendation Engine Using Snowflake & Amazon SageMaker ☆31 Updated 3 years ago
- Analyzing and calculating key marketing metrics with SQL and Python ☆14 Updated 5 years ago
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- Cloned by the `dbt init` task ☆59 Updated 6 months ago
- customer lifetime value BG/NBD model ☆17 Updated 2 years ago
- Snowflake Cookbook, published by Packt ☆73 Updated last year
- #DataPipeLine #ETL - Created a Facebook data extraction utility to extract publicly available data from Facebook. Used Facebook Grap… ☆14 Updated 6 years ago
- Analytics for building Customer Journey Map in Ecommerce ☆28 Updated 4 years ago
- Fivetran data models for QuickBooks using dbt. ☆27 Updated this week
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆24 Updated last year
- E-Commerce Website A/B testing: Recommend which of two landing pages to keep based on A/B testing ☆24 Updated 6 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆104 Updated this week
- Build your feature store with macros right within your dbt repository ☆37 Updated last year
- Mastering Spark for Data Science, published by Packt ☆46 Updated last year
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆37 Updated 5 years ago
- Spark NLP for Streamlit ☆15 Updated 3 years ago
- Fully unit tested utility functions for data engineering. Python 3 only. ☆14 Updated 3 months ago
- ☆19 Updated 3 years ago
- Predict the poverty of households in Costa Rica using automated feature engineering. ☆23 Updated 4 years ago
- Example Multi-Cycle, Multi-Touch Revenue and Cost Attribution Model ☆20 Updated 9 months ago
- ☆26 Updated 5 years ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on… ☆26 Updated 2 years ago
- AWS Big Data Certification ☆25 Updated last year
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38 Updated 2 years ago
- Predicting the Likelihood to Purchase a Financial Product Following a Direct Marketing Campaign ☆28 Updated last year
- A minimum viable setup for dbt with environment variables. ☆16 Updated 6 years ago