tfayyaz / cloud-dataproc
Cloud Dataproc: Samples and Utils
☆11 Updated 5 years ago
Alternatives and similar repositories for cloud-dataproc
Users who are interested in cloud-dataproc are comparing it to the repositories listed below
- ☆130 Updated last year
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆72 Updated last year
- Demo assets for DAIS 2021 'Learn to use Databricks for the full ML lifecycle' Talk ☆14 Updated 3 years ago
- Dataproc templates and pipelines for solving in-cloud data tasks ☆132 Updated last week
- Sample code with integration between Data Catalog and RDBMS data sources. ☆72 Updated 3 years ago
- ☆143 Updated 10 months ago
- This repo contains live examples to build Databricks' Lakehouse and recommended best practices from the field. ☆21 Updated 11 months ago
- The go-to demo for public and private dbt Learn ☆80 Updated 5 months ago
- Data Engineering with Spark and Delta Lake ☆104 Updated 2 years ago
- An end-to-end demo of Google's Cloud data and analytic stack. ☆267 Updated this week
- Repository for Beam College sessions ☆110 Updated 4 years ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ☆30 Updated 2 years ago
- ☆42 Updated 5 years ago
- ☆95 Updated 8 months ago
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆182 Updated 3 years ago
- ☆36 Updated 3 years ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆67 Updated 3 years ago
- ☆88 Updated 2 years ago
- Cloud-native, data onboarding architecture for Google Cloud Datasets ☆165 Updated last month
- Exercises, source files and anything else related to Snowflake Data Engineering Bootcamp course on O'Reilly learning platform. ☆31 Updated 4 months ago
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 Updated 2 years ago
- Cloned by the `dbt init` task ☆62 Updated last year
- Source code for 'BigQuery for Data Warehousing' by Mark Mucchetti ☆16 Updated 4 years ago
- A series of Jupyter notebooks that walk you through Machine Learning with Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo… ☆83 Updated last year
- Markup to create labs for courses from the Google Cloud training catalog. ☆49 Updated last week
- Building Big Data Pipelines with Apache Beam, published by Packt ☆86 Updated 2 years ago
- Interactive Notebooks that support the book ☆40 Updated 4 years ago
- ☆24 Updated 2 years ago
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆44 Updated 2 years ago
- Source code for the YouTube video, Apache Beam Explained in 12 Minutes ☆21 Updated 4 years ago