mozilla / gcp-ingestion
Documentation and implementation of telemetry ingestion on Google Cloud Platform
☆82 Updated this week
Alternatives and similar repositories for gcp-ingestion:
Users who are interested in gcp-ingestion compare it to the repositories listed below.
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆91 Updated 7 months ago
- ☆128 Updated 11 months ago
- ☆47 Updated 10 months ago
- Schemas for Mozilla's data ingestion pipeline and data lake outputs ☆47 Updated this week
- ☆66 Updated 7 months ago
- Airflow configuration for Telemetry ☆185 Updated this week
- Bigquery ETL ☆292 Updated this week
- Cloud-native, data onboarding architecture for Google Cloud Datasets ☆158 Updated last month
- Oozie Workflow to Airflow DAGs migration tool ☆88 Updated 2 weeks ago
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated last month
- Open source tools for Google Cloud Storage and Databases. ☆63 Updated 10 months ago
- Commons code used by the Data Catalog connectors, and links for the connectors sample code. ☆61 Updated 3 years ago
- DBeam exports SQL tables into Avro files using JDBC and Apache Beam ☆195 Updated this week
- Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow. ☆142 Updated 9 months ago
- Database plugins ☆14 Updated this week
- Cloud Dataproc: Samples and Utils ☆201 Updated 2 months ago
- ☆31 Updated 6 years ago
- Cask Hydrator Plugins Repository ☆68 Updated this week
- Sample code with integration between Data Catalog and BI data sources. ☆32 Updated 3 years ago
- A curated list of awesome resources for Apache Beam ☆146 Updated 2 years ago
- Astronomer Core Docker Images ☆106 Updated 10 months ago
- Sample code with integration between Data Catalog and RDBMS data sources. ☆72 Updated 3 years ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆37 Updated 7 years ago
- Tag Engine automates the process of creating, updating, deleting, and populating metadata in bulk with the Google Cloud services Data Cat… ☆53 Updated last week
- A collection of Google Cloud Platform (GCP) plugins ☆45 Updated this week
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. ☆53 Updated this week
- Identify and tokenize sensitive data automatically using Cloud DLP and Dataflow ☆43 Updated 2 weeks ago
- ☆65 Updated this week
- ☆28 Updated 10 months ago
- Advertising Data Lakes and Workflow Automation ☆50 Updated 4 years ago