A full data warehouse infrastructure with ETL pipelines running inside Docker on Apache Airflow for data orchestration, AWS Redshift as the cloud data warehouse, and Metabase for data visualizations such as analytical dashboards.
☆139 Apr 18, 2020 Updated 6 years ago
Alternatives and similar repositories for Skytrax-Data-Warehouse
Users that are interested in Skytrax-Data-Warehouse are comparing it to the libraries listed below.
- A way for home buyers to know about factors affecting a state ☆48 Mar 2, 2019 Updated 7 years ago
- Tracking and measuring neighborhood and district-level eviction rates in the city of San Francisco. ☆141 Jul 14, 2020 Updated 5 years ago
- Airflow ETL for Meetup API ☆45 Dec 27, 2018 Updated 7 years ago
- Built a Data Pipeline for a Retail store using AWS services that collects data from its transactional database (OLTP) in Snowflake and tr… ☆13 May 25, 2023 Updated 3 years ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash. ☆15 Jun 26, 2023 Updated 2 years ago
- Pyspark Spotify ETL ☆17 Aug 19, 2021 Updated 4 years ago
- RedditR for Content Engagement and Recommendation ☆18 Dec 21, 2017 Updated 8 years ago
- Beginner data engineering project - batch edition ☆582 Apr 13, 2026 Updated last month
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆89 Nov 22, 2021 Updated 4 years ago
- Tough and flexible tools for data analysis, transformation, validation and movement. ☆142 Jan 26, 2024 Updated 2 years ago
- 😈 Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆51 Aug 23, 2019 Updated 6 years ago
- This checklist aims to be an exhaustive list of all elements you should consider when using Amazon Redshift. ☆15 Sep 21, 2020 Updated 5 years ago
- Few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake developme… ☆1,908 Aug 26, 2022 Updated 3 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks. ☆10 Oct 8, 2022 Updated 3 years ago
- An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform. ☆1,510 Mar 9, 2020 Updated 6 years ago
- Run greatexpectations.io on ANY SQL Engine using REST API. Supported by FastAPI, Pydantic and SQLAlchemy as best data quality tool ☆14 Dec 12, 2025 Updated 5 months ago
- An end-to-end data engineering pipeline to create a dashboard for the latest content on the r/Stocks subreddit ☆20 Aug 5, 2022 Updated 3 years ago
- Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database ☆14 Oct 26, 2021 Updated 4 years ago
- A platform-agnostic index of Singer.io taps and targets. ☆11 Jan 29, 2021 Updated 5 years ago
- Example end to end data engineering project. ☆1,409 Dec 8, 2022 Updated 3 years ago
- A template for dockerized dbt-Core projects with VS Code Dev Containers. ☆21 Nov 14, 2022 Updated 3 years ago
- ☆196 Feb 25, 2022 Updated 4 years ago
- My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on lambda architecture, that aggrega… ☆506 Aug 24, 2022 Updated 3 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆687 Mar 6, 2025 Updated last year
- leverage predictive models, machine learning, data mining to solve marketing, business problems ☆14 Feb 23, 2017 Updated 9 years ago
- For over a year now, everything about my professional life has been around Google Cloud. This repo is a repercussion of my disastrous Goo… ☆12 Nov 17, 2019 Updated 6 years ago
- Spark, Airflow, Kafka ☆24 Apr 30, 2023 Updated 3 years ago
- Personal Data Engineering Projects ☆1,013 Feb 8, 2023 Updated 3 years ago
- My Portfolio of all the projects I did for both my Udacity Data Engineer and Data Streaming Nanodegrees ☆21 Jul 16, 2020 Updated 5 years ago
- Solution to all projects of Udacity's Data Engineering Nanodegree: Data Modeling with Postgres & Cassandra, Data Warehouse with Redshift,… ☆57 Oct 20, 2022 Updated 3 years ago
- Simple ETL pipeline using Python ☆29 May 22, 2023 Updated 3 years ago
- Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow. ☆350 Jan 12, 2022 Updated 4 years ago
- R package for tracking Covid19 cases in San Francisco ☆12 Apr 2, 2023 Updated 3 years ago
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆899 May 8, 2022 Updated 4 years ago
- Any Airflow project day 1, you can spin up a local desktop Kubernetes Airflow environment AND one in Google Cloud Composer with tested da… ☆114 Sep 21, 2023 Updated 2 years ago
- ☆13 Jan 7, 2022 Updated 4 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS) ☆17 Dec 18, 2018 Updated 7 years ago
- Data pipeline that scrapes Rust cheater Steam profiles ☆54 Feb 13, 2022 Updated 4 years ago
- A list of useful resources to learn Data Engineering from scratch ☆3,996 Jun 19, 2024 Updated last year