tfayyaz / cloud-dataproc
Cloud Dataproc: Samples and Utils
☆11 Updated 4 years ago
Alternatives and similar repositories for cloud-dataproc:
Users who are interested in cloud-dataproc are comparing it to the repositories listed below.
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆67 Updated 10 months ago
- ☆128 Updated 11 months ago
- Dataproc templates and pipelines for solving simple in-cloud data tasks ☆126 Updated 3 weeks ago
- ☆11 Updated last year
- ☆25 Updated 4 years ago
- Demo assets for DAIS 2021 'Learn to use Databricks for the full ML lifecycle' Talk ☆13 Updated 3 years ago
- ☆47 Updated 10 months ago
- ☆36 Updated 2 years ago
- This repository contains code for Spark Streaming ☆21 Updated 4 years ago
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 Updated 2 years ago
- A series of Jupyter notebooks that walk you through Machine Learning with Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo… ☆81 Updated last year
- GCP-Data-Engineer-Study-Guide ☆120 Updated 5 years ago
- ☆175 Updated last week
- CICD pipeline that deploys a dbt image on a GKE cluster ☆11 Updated 3 years ago
- ☆134 Updated 4 months ago
- Sample Airflow DAGs ☆62 Updated 2 years ago
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated 2 months ago
- Full stack data engineering tools and infrastructure set-up ☆50 Updated 4 years ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ☆29 Updated last year
- Collection of Machine Learning Examples for Azure Databricks ☆41 Updated 4 years ago
- Source code for the YouTube video, Apache Beam Explained in 12 Minutes ☆21 Updated 4 years ago
- ☆60 Updated this week
- Data Catalog Tag Templates ☆30 Updated 5 months ago
- Building Big Data Pipelines with Apache Beam, published by Packt ☆86 Updated 2 years ago
- ☆17 Updated last week
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse. ☆44 Updated 2 months ago
- ☆43 Updated last year
- ☆23 Updated 2 years ago
- Source code for 'BigQuery for Data Warehousing' by Mark Mucchetti ☆16 Updated 4 years ago
- Cloud-native, data onboarding architecture for Google Cloud Datasets ☆158 Updated last month