tfayyaz / cloud-dataproc
Cloud Dataproc: Samples and Utils
☆11 Updated 4 years ago
Alternatives and similar repositories for cloud-dataproc:
Users interested in cloud-dataproc are comparing it to the repositories listed below.
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆64 Updated 9 months ago
- CICD pipeline that deploys a dbt image on a GKE cluster ☆11 Updated 3 years ago
- ☆127 Updated 9 months ago
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated 3 weeks ago
- Dataproc templates and pipelines for solving simple in-cloud data tasks ☆123 Updated last week
- Building Big Data Pipelines with Apache Beam, published by Packt ☆85 Updated last year
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. ☆49 Updated last month
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆90 Updated 6 months ago
- ☆132 Updated 3 months ago
- Source code for the YouTube video, Apache Beam Explained in 12 Minutes ☆20 Updated 4 years ago
- ☆46 Updated 9 months ago
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 Updated 2 years ago
- ☆43 Updated 4 years ago
- Interactive Notebooks that support the book ☆39 Updated 4 years ago
- ☆11 Updated last year
- ☆59 Updated last month
- Rules based grant management for Snowflake ☆40 Updated 6 years ago
- Snowflake Guide: Building a Recommendation Engine Using Snowflake & Amazon SageMaker ☆31 Updated 3 years ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ☆29 Updated last year
- ☆27 Updated 4 years ago
- This repository shows a sample example to build, manage and orchestrate Machine Learning workflows using Amazon Sagemaker and Apache Airf… ☆136 Updated 3 years ago
- ☆23 Updated 2 years ago
- PySpark data-pipeline testing and CICD ☆28 Updated 4 years ago
- ☆33 Updated 2 months ago
- Cloud Dataproc: Samples and Utils ☆200 Updated last month
- Materials for the next course ☆24 Updated 2 years ago
- ☆33 Updated 8 months ago
- Data Engineering with Spark and Delta Lake ☆95 Updated 2 years ago
- Sample code with integration between Data Catalog and BI data sources. ☆32 Updated 3 years ago
- A series of Jupyter notebooks that walk you through Machine Learning with Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo… ☆81 Updated last year