telia-oss / birgitta
Birgitta is a Python ETL test and schema framework that provides automated tests for PySpark notebooks/recipes.
☆14 Updated 2 years ago
Alternatives and similar repositories for birgitta
Users who are interested in birgitta are comparing it to the libraries listed below.
- A collection of python utility functions ☆11 Updated last month
- [under development] ETL materials to support proposal for CDM enhancements for clinical trial data ☆24 Updated 4 years ago
- Hephaestus - ETL and ML tools for OHDSI - OMOP CDM ☆13 Updated 3 months ago
- FHIR to OMOP using PySpark on AWS Glue ☆14 Updated 4 years ago
- Medium Article ☆11 Updated 4 years ago
- Simple samples for writing ETL transform scripts in Python ☆24 Updated last week
- The best Python package for comparing two dataframes ☆11 Updated 3 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 Updated 4 years ago
- ☆12 Updated last month
- 🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎 ☆16 Updated 3 years ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆36 Updated 6 years ago
- Extension to Python-Markdown to translate pydantic's model fields to markdown table ☆12 Updated last year
- ☆11 Updated 4 years ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆39 Updated 3 years ago
- Parquet file management in S3 for Athena / Spectrum / Presto partitioning ☆22 Updated 10 months ago
- Content for a talk on "The wonderful world of data quality tools in Python" ☆18 Updated 4 years ago
- Utility functions for dbt projects running on Spark ☆34 Updated this week
- Pandas helper functions ☆31 Updated 2 years ago
- Cohort extractor tool which can generate dummy data, or real data against OpenSAFELY-compliant research databases ☆38 Updated 5 months ago
- Full stack data engineering tools and infrastructure set-up ☆57 Updated 4 years ago
- ☆48 Updated 2 years ago
- Build and deploy a serverless data pipeline on AWS with no effort. ☆111 Updated 2 years ago
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit… ☆66 Updated this week
- ☆10 Updated 5 years ago
- Publication: Linked electronic health records for research on a nationwide cohort including over 54 million people in England ☆19 Updated 2 years ago
- Outcomes Insights' Data Model for Clinical Research ☆19 Updated 4 months ago
- Fake Pandas / PySpark DataFrame creator ☆48 Updated last year
- Build your feature store with macros right within your dbt repository ☆39 Updated 3 years ago
- Common Python tools and utilities for data engineering, ETL, Exploration, etc. made opensource and packaged, making it easy to use in any… ☆13 Updated last month
- ⚡️ Pandas dataframes with object oriented programming style (not maintained) ☆11 Updated last year