monte-carlo-data / data-downtime-challenge
☆84 · Updated 2 years ago
Alternatives and similar repositories for data-downtime-challenge:
Users who are interested in data-downtime-challenge are comparing it to the repositories listed below.
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆36 · Updated 8 months ago
- ☆27 · Updated 2 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆56 · Updated 8 months ago
- Essential PySpark for Scalable Data Analytics, published by Packt ☆44 · Updated 2 years ago
- (project & tutorial) DAG pipeline tests + CI/CD setup ☆85 · Updated 4 years ago
- Example repo to kickstart integration with MLflow pipelines. ☆76 · Updated 2 years ago
- Data Engineering with Spark and Delta Lake ☆97 · Updated 2 years ago
- Snowflake Cookbook, published by Packt ☆79 · Updated 2 years ago
- Batch processing, orchestration using Apache Airflow and Google Workflows, Spark Structured Streaming, and a lot more ☆19 · Updated 2 years ago
- ⭕️ Data Engineering for Data Scientists ☆77 · Updated 2 years ago
- Full-stack data engineering tools and infrastructure setup ☆50 · Updated 4 years ago
- Ingesting data with Pulumi, AWS Lambdas and Snowflake in a scalable, fully replayable manner ☆71 · Updated 3 years ago
- ☆181 · Updated 4 years ago
- ☆36 · Updated 2 years ago
- Intro to Generative AI with Snowflake ☆38 · Updated 3 months ago
- An example MLflow project ☆48 · Updated 3 months ago
- ☆87 · Updated 2 years ago
- An example of an ETL pipeline that lays out generic DE processes. This is now out of date but still provides useful information. ☆26 · Updated 2 years ago
- Scaling Machine Learning in Three Week course in collaboration with O'Reilly, following the guidance of Adi Polak's book - Scaling Machi… ☆23 · Updated last year
- A series of Jupyter notebooks that walk you through Machine Learning with the Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo… ☆81 · Updated last year
- Public source code for the Udemy online course Apache Airflow: Complete Hands-On Beginner to Advanced Class. ☆63 · Updated 4 years ago
- ☆39 · Updated 3 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom Airflow operators that handle the extraction, transformation,… ☆90 · Updated 3 years ago
- Optimizing Databricks Workload, published by Packt ☆17 · Updated 2 years ago
- ☆106 · Updated 2 years ago
- Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database ☆13 · Updated 3 years ago
- Quickstart: Getting Started with Snowpark Python ☆31 · Updated 2 years ago
- Just starting your DE journey or along the way already? I will be sharing a short list of DATA-ENGINEERING-CENTRED books that cover the… ☆34 · Updated 2 years ago
- Code repository for the "PySpark in Action" book ☆196 · Updated 2 years ago
- ☆20 · Updated 5 years ago