dunnhumby / democratizing-dataproc
Using Terraform, deploy multiple Dataproc clusters that share a Hive metastore
☆15 · Updated 3 years ago
Alternatives and similar repositories for democratizing-dataproc
Users that are interested in democratizing-dataproc are comparing it to the libraries listed below
- Airflow configuration for Telemetry ☆193 · Updated last week
- Documentation and implementation of telemetry ingestion on Google Cloud Platform ☆83 · Updated this week
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆95 · Updated last year
- Repository with examples and smoke tests for the GCP Airflow operators and hooks ☆149 · Updated 8 years ago
- Example Kubernetes app that shows how to build a 'pipeline' to stream data into BigQuery. Uses Redis or Google Cloud PubSub ☆129 · Updated 4 years ago
- ☆54 · Updated 8 years ago
- ☆46 · Updated last year
- Data models for snowplow analytics. ☆129 · Updated 6 months ago
- Quickly get a kubernetes executor airflow environment provisioned on GKE. Azure Kubernetes Service instructions included also as are inst… ☆36 · Updated 5 years ago
- The open source version of the Amazon Athena documentation. To submit feedback & requests for changes, submit issues in this repository, … ☆83 · Updated 2 years ago
- ☆33 · Updated last year
- Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform ☆260 · Updated 2 years ago
- Streaming data from Cloud Storage into BigQuery using Cloud Functions ☆49 · Updated 4 years ago
- Cloud Build for Deploying Datapipelines with Composer, Dataflow and BigQuery ☆64 · Updated 5 years ago
- Cloud-native, data onboarding architecture for Google Cloud Datasets ☆165 · Updated 3 weeks ago
- A Getting Started Guide for developing and using Airflow Plugins ☆93 · Updated 6 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆111 · Updated 3 weeks ago
- Airflow workflow management platform chef cookbook. ☆71 · Updated 6 years ago
- ☆130 · Updated last year
- Opinion Analysis of News, Threaded Conversations, and User Generated Content ☆105 · Updated 11 months ago
- Metadata service library for Amundsen ☆83 · Updated last month
- Sample code with integration between Data Catalog and RDBMS data sources. ☆71 · Updated 3 years ago
- Creates opinionated BigQuery datasets and tables ☆225 · Updated this week
- Ephemeral Hadoop clusters using Google Compute Platform ☆136 · Updated 3 years ago
- Oozie Workflow to Airflow DAGs migration tool ☆87 · Updated 5 months ago
- Example stream processing job, written in Scala with Apache Beam, for Google Cloud Dataflow ☆30 · Updated 8 years ago
- ☆20 · Updated 6 years ago
- Pylint plugin for static code analysis on Airflow code ☆95 · Updated 4 years ago
- Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform. ☆285 · Updated this week
- This is the support code and solutions for the NYC Taxi Tycoon Dataflow Codelab ☆61 · Updated 5 years ago