moj-analytical-services / dataengineeringutils3
Fully unit tested utility functions for data engineering. Python 3 only.
☆14 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for dataengineeringutils3
- CLI for data platform · ☆19 · Updated 11 months ago
- Make dbt great again! Enables end users to extend dbt to their needs · ☆13 · Updated this week
- dbt (data build tool) projects targeting AWS analytics services (Redshift, Glue, EMR, Athena) and open table formats · ☆25 · Updated last year
- Build your feature store with macros right within your dbt repository · ☆37 · Updated last year
- Example setup for dbt Cloud using GitHub integrations · ☆11 · Updated 4 years ago
- Dask integration for Snowflake · ☆30 · Updated last week
- A Python package to create a database on the platform using our MoJ data warehousing framework · ☆21 · Updated 2 months ago
- Getting Great Expectations set up to run on Databricks with Spark DataFrames. · ☆12 · Updated 2 years ago
- AWS Quick Start Team · ☆18 · Updated last month
- dbt / Amazon Redshift Demonstration Project · ☆33 · Updated last year
- Data-aware orchestration with dagster, dbt, and airbyte · ☆30 · Updated last year
- dbt package for monitoring Airflow DAGs and tasks · ☆29 · Updated this week
- A serverless DuckDB deployment on GCP · ☆35 · Updated 2 years ago
- Fake Pandas / PySpark DataFrame creator · ☆42 · Updated 8 months ago
- An infrastructure as code approach to deploying Snowflake using Terraform · ☆24 · Updated last year
- ☆15 · Updated 3 months ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. · ☆24 · Updated 6 years ago
- Activity Schema dbt package · ☆14 · Updated last year
- An experimental Athena extension for DuckDB 🐤 · ☆50 · Updated 9 months ago
- ☆28 · Updated this week
- Spark app to merge different schemas · ☆23 · Updated 3 years ago
- Collection of utility scripts to extract code so it can be upgraded to Snowflake using the SnowConvert tool. · ☆11 · Updated 5 months ago
- PySpark-parallelised functions producing graph-theoretical metrics in connected component clusters for use in record-linkage (or other do… · ☆10 · Updated last year
- ☆23 · Updated 5 months ago
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the Emr Serverless job, you… · ☆10 · Updated this week
- 🐋 Docker image for AWS Glue Spark/Python · ☆22 · Updated last year
- The sane way of building a data layer in Airflow · ☆24 · Updated 4 years ago
- A simple and easy-to-use Data Quality (DQ) tool built with Python. · ☆48 · Updated last year
- Utility functions for dbt projects running on Spark · ☆31 · Updated last year