dunnhumby / democratizing-dataproc
Using Terraform, deploy multiple Dataproc clusters that share a Hive metastore
☆15 · Updated 3 years ago
Alternatives and similar repositories for democratizing-dataproc
Users who are interested in democratizing-dataproc are comparing it to the repositories listed below.
- Example stream processing job, written in Scala with Apache Beam, for Google Cloud Dataflow ☆30 · Updated 8 years ago
- Tools for creating Dataproc custom images ☆34 · Updated 2 weeks ago
- ☆54 · Updated 7 years ago
- ☆46 · Updated last year
- The sane way of building a data layer in Airflow ☆24 · Updated 5 years ago
- Tools to deploy Hadoop on EMC Isilon ☆17 · Updated 8 years ago
- Airflow workflow management platform chef cookbook. ☆71 · Updated 6 years ago
- Snippets of code used in blog posts and other media. ☆13 · Updated this week
- Repo for the Stitch Docs ☆57 · Updated 2 weeks ago
- Data Catalog Tag Templates ☆30 · Updated 2 months ago
- Opinion Analysis of News, Threaded Conversations, and User Generated Content ☆103 · Updated 9 months ago
- Collection of utility scripts to extract code so it can be upgraded to SnowFlake using the SnowConvert tool. ☆20 · Updated this week
- A toolset to streamline running spark python on EMR ☆20 · Updated 8 years ago
- Puppet module to provision Airbnb's Airflow ☆19 · Updated 3 years ago
- Terraform script for launching multiple EMR clusters for training purposes. ☆16 · Updated last year
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆93 · Updated 11 months ago
- This is the support code and solutions for the NYC Taxi Tycoon Dataflow Codelab ☆61 · Updated 5 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 · Updated last year
- An extension for Jupyter notebooks that allows running notebooks inside a Docker container and converting them to runnable Docker images.