mozilla / docker-etl
Collection of dockerized ETL jobs managed by data engineering.
☆19 Updated last week
Related projects
Alternatives and complementary repositories for docker-etl
- ETL jobs for Firefox Telemetry ☆27 Updated 2 months ago
- Utility functions for dbt projects running on Spark ☆31 Updated last year
- Documentation and implementation of telemetry ingestion on Google Cloud Platform ☆79 Updated this week
- LookML Generator for Glean and Mozilla Data ☆17 Updated this week
- This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-dataproc ☆48 Updated last year
- Example code for doing DataOps ☆46 Updated 3 years ago
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. ☆46 Updated 3 weeks ago
- Fake Pandas / PySpark DataFrame creator ☆42 Updated 8 months ago
- ☆20 Updated 3 years ago
- dbt adapter for Azure Synapse Dedicated SQL Pools ☆70 Updated this week
- A Table format agnostic data sharing framework ☆38 Updated 9 months ago
- Delta Lake Documentation ☆46 Updated 5 months ago
- A curated list of awesome Databricks resources, including Spark ☆14 Updated 4 months ago
- This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-iam ☆37 Updated last year
- ☆46 Updated 6 months ago
- Enforce Best Practices for all your Airflow DAGs. ⭐ ☆92 Updated this week
- dbt package for monitoring airflow DAGs and tasks ☆29 Updated this week
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆60 Updated last year
- Simplifies storing test results and visualising them in a BI dashboard ☆40 Updated last week
- Astronomer Core Docker Images ☆106 Updated 5 months ago
- A Python API for Asynchronously Loading Data into Snowflake DB ☆60 Updated this week
- Tag Engine automates the process of creating, updating, deleting, and populating metadata in bulk with the Google Cloud services Data Cat… ☆49 Updated last month
- Full stack data engineering tools and infrastructure set-up ☆44 Updated 3 years ago
- New generation opensource data stack ☆61 Updated 2 years ago
- A bunch of hacks developed around dbt ☆48 Updated 5 years ago
- ☆25 Updated this week
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce ☆21 Updated last month
- Pytest plugin for dbt core ☆58 Updated 5 months ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆72 Updated last year