mozilla / docker-etl
Collection of dockerized ETL jobs managed by data engineering.
☆19 · Updated last week
Related projects
Alternatives and complementary repositories for docker-etl
- LookML Generator for Glean and Mozilla Data · ☆17 · Updated this week
- ☆46 · Updated 6 months ago
- End-to-end DataOps platform deployed by Terraform. · ☆63 · Updated 4 months ago
- Apache Airflow CI pipeline · ☆18 · Updated 5 years ago
- ETL jobs for Firefox Telemetry · ☆27 · Updated last month
- Astronomer Core Docker Images · ☆106 · Updated 5 months ago
- This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-dataproc · ☆48 · Updated last year
- Utility functions for dbt projects running on Spark · ☆31 · Updated last year
- Unity Catalog UI · ☆39 · Updated 2 months ago
- This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-iam · ☆37 · Updated last year
- A new Airflow Provider for Fivetran, maintained by Astronomer and Fivetran · ☆20 · Updated 2 weeks ago
- Delta reader for the Ray open-source toolkit for building ML applications · ☆42 · Updated 9 months ago
- Extension dtypes for pandas corresponding to GoogleSQL data types such as DATE, TIME, and JSON. · ☆26 · Updated this week
- Schemas for Mozilla's data ingestion pipeline and data lake outputs · ☆46 · Updated this week
- Sample Airflow DAGs · ☆61 · Updated last year
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. · ☆46 · Updated 2 weeks ago
- a pytest plugin for dbt adapter test suites · ☆19 · Updated last year
- Pipeline definitions for managing data flows to power analytics at MIT Open Learning · ☆37 · Updated this week
- Pylint plugin for static code analysis on Airflow code · ☆90 · Updated 4 years ago
- ☆24 · Updated 4 years ago
- A DataOps framework for building a lakehouse. · ☆30 · Updated this week
- Commons code used by the Data Catalog connectors, and links for the connectors sample code. · ☆61 · Updated 2 years ago
- CloudEvent Types for Python · ☆26 · Updated last week
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service · ☆63 · Updated 6 months ago
- Rules based grant management for Snowflake · ☆40 · Updated 5 years ago
- Any Airflow project day 1, you can spin up a local desktop Kubernetes Airflow environment AND one in Google Cloud Composer with tested da… · ☆110 · Updated last year
- Airflow configuration for Telemetry · ☆182 · Updated this week
- Apache Airflow https://airflow.apache.org · ☆41 · Updated this week
- [ARCHIVED] The Presto adapter plugin for dbt Core · ☆33 · Updated 10 months ago