mozilla / gcp-ingestion
Documentation and implementation of telemetry ingestion on Google Cloud Platform
☆81 Updated this week
Alternatives and similar repositories for gcp-ingestion:
Users who are interested in gcp-ingestion are comparing it to the libraries listed below:
- Schemas for Mozilla's data ingestion pipeline and data lake outputs ☆47 Updated this week
- Airflow configuration for Telemetry ☆185 Updated this week
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆90 Updated 5 months ago
- Bigquery ETL ☆267 Updated this week
- ☆46 Updated 8 months ago
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated 8 months ago
- Tag Engine automates the process of creating, updating, deleting, and populating metadata in bulk with the Google Cloud services Data Cat… ☆51 Updated 3 weeks ago
- ☆127 Updated 8 months ago
- ☆63 Updated this week
- Astronomer Core Docker Images ☆106 Updated 7 months ago
- Commons code used by the Data Catalog connectors, and links for the connectors sample code. ☆61 Updated 3 years ago
- Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow. ☆143 Updated 7 months ago
- Cloud-native, data onboarding architecture for Google Cloud Datasets ☆155 Updated this week
- Data ingestion library for Amundsen to build graph and search index ☆205 Updated 10 months ago
- Cloud Dataproc: Samples and Utils ☆199 Updated last week
- Oozie Workflow to Airflow DAGs migration tool ☆87 Updated 3 weeks ago
- ETL jobs for Firefox Telemetry ☆28 Updated 4 months ago
- Identify and tokenize sensitive data automatically using Cloud DLP and Dataflow ☆42 Updated this week
- A guide for Mozilla's developers and data scientists to analyze and interpret the data gathered by our data collection systems. ☆87 Updated this week
- ☆65 Updated 5 months ago
- End-to-end DataOps platform deployed by Terraform. ☆65 Updated 6 months ago
- ☆43 Updated 3 weeks ago
- ☆31 Updated 6 years ago
- Sample code with integration between Data Catalog and RDBMS data sources. ☆72 Updated 3 years ago
- Sample code with integration between Data Catalog and BI data sources. ☆32 Updated 2 years ago
- DBeam exports SQL tables into Avro files using JDBC and Apache Beam ☆193 Updated this week
- Utility to identify and rewrite common anti patterns in BigQuery SQL syntax ☆86 Updated 2 months ago
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. ☆47 Updated last week
- Creates opinionated BigQuery datasets and tables ☆203 Updated this week
- Dataproc templates and pipelines for solving simple in-cloud data tasks ☆122 Updated this week