dunnhumby / democratizing-dataproc
Using Terraform, deploy multiple Dataproc clusters that share a Hive metastore
☆15 Updated 2 years ago
Alternatives and similar repositories for democratizing-dataproc:
Users who are interested in democratizing-dataproc are comparing it to the repositories listed below
- GCP Plugin for Gordon: Event-driven Cloud DNS ☆12 Updated last year
- This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-dataproc ☆48 Updated last year
- Example stream processing job, written in Scala with Apache Beam, for Google Cloud Dataflow ☆30 Updated 7 years ago
- Tools for creating Dataproc custom images ☆32 Updated this week
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆90 Updated 6 months ago
- ☆54 Updated 7 years ago
- Repo for the Stitch Docs ☆56 Updated this week
- Cloudera Director sample code ☆61 Updated 5 years ago
- Sample code with integration between Data Catalog and BI data sources. ☆32 Updated 3 years ago
- Tools for working with Singer Taps and Targets ☆59 Updated 5 months ago
- Example Kubernetes app that shows how to build a 'pipeline' to stream data into BigQuery. Uses Redis or Google Cloud PubSub ☆129 Updated 4 years ago
- ☆46 Updated 9 months ago
- Database plugins ☆14 Updated this week
- Cloud Build for Deploying Datapipelines with Composer, Dataflow and BigQuery ☆64 Updated 4 years ago
- An open source library for BigQuery testing. ☆14 Updated 2 years ago
- ☆20 Updated 5 years ago
- Dependency Management Toolkit for Google Cloud Python Projects ☆44 Updated 2 years ago
- ☆33 Updated 10 months ago
- Utilities to help HBase as a service in HDInsight Azure ☆14 Updated last year
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Uses Cloud Build to deploy a scalable batch ingestion pipeline consisting of GCS, Cloud Functions, Dataflow and BigQuery ☆22 Updated 2 years ago
- Creates opinionated BigQuery datasets and tables ☆205 Updated last week
- This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-dlp ☆43 Updated last year
- ☕⛵ WIP PySpark dependency management ☆22 Updated 6 years ago
- Styles for dbt on the net ☆9 Updated 2 months ago
- Lightweight configuration and access to multiple databases in a single project ☆38 Updated last year
- Data models for snowplow analytics. ☆127 Updated last week
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. ☆49 Updated last month
- Data Catalog Tag Templates ☆30 Updated 4 months ago
- Replicates data between Google Cloud BigQuery projects ☆21 Updated 8 years ago