mozilla / gcp-ingestion
Documentation and implementation of telemetry ingestion on Google Cloud Platform
☆79 · Updated this week
Related projects
Alternatives and complementary repositories for gcp-ingestion
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆89 · Updated 2 months ago
- Bigquery ETL ☆258 · Updated this week
- Airflow configuration for Telemetry ☆182 · Updated this week
- Schemas for Mozilla's data ingestion pipeline and data lake outputs ☆46 · Updated this week
- Tag Engine automates the process of creating, updating, deleting, and populating metadata in bulk with the Google Cloud services Data Cat… ☆49 · Updated 2 weeks ago
- ☆126 · Updated 6 months ago
- Data Quality Engine for BigQuery ☆258 · Updated 3 months ago
- Sample code with integration between Data Catalog and BI data sources. ☆32 · Updated 2 years ago
- ☆46 · Updated 6 months ago
- Utility to identify and rewrite common anti patterns in BigQuery SQL syntax ☆83 · Updated this week
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 · Updated 8 months ago
- Commons code used by the Data Catalog connectors, and links for the connectors sample code. ☆61 · Updated 2 years ago
- Dataproc templates and pipelines for solving simple in-cloud data tasks ☆118 · Updated this week
- Cloud-native, data onboarding architecture for Google Cloud Datasets ☆150 · Updated this week
- Sample code with integration between Data Catalog and Hive data source. ☆25 · Updated 6 months ago
- Oozie Workflow to Airflow DAGs migration tool ☆87 · Updated 2 weeks ago
- A curated list of awesome resources for Apache Beam ☆146 · Updated last year
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. ☆46 · Updated 2 weeks ago
- ☆31 · Updated 6 years ago
- Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow. ☆142 · Updated 5 months ago
- LookML Generator for Glean and Mozilla Data ☆17 · Updated this week
- DBeam exports SQL tables into Avro files using JDBC and Apache Beam ☆191 · Updated this week
- ☆64 · Updated 2 months ago
- Snowflake Data Source for Apache Spark. ☆217 · Updated this week
- End-to-end DataOps platform deployed by Terraform. ☆63 · Updated 4 months ago
- Ephemeral Hadoop clusters using Google Compute Platform ☆134 · Updated 2 years ago
- Apache Airflow CI pipeline ☆18 · Updated 5 years ago
- Data Catalog Tag Templates ☆29 · Updated 3 weeks ago
- Data ingestion library for Amundsen to build graph and search index ☆206 · Updated 7 months ago
- Database plugins ☆14 · Updated this week