telia-oss / birgitta
Birgitta is a Python ETL test and schema framework, providing automated tests for pyspark notebooks/recipes.
☆14Updated 2 years ago
Alternatives and similar repositories for birgitta
Users who are interested in birgitta are comparing it to the libraries listed below.
- ☆11Updated 4 years ago
- ☆12Updated 3 months ago
- Medium Article☆11Updated 4 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.☆66Updated this week
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊☆79Updated last year
- [under development] ETL materials to support proposal for CDM enhancements for clinical trial data☆24Updated 4 years ago
- FHIR to OMOP using PySpark on AWS Glue☆14Updated 4 years ago
- Simple samples for writing ETL transform scripts in Python☆24Updated last week
- A collection of python utility functions☆11Updated last week
- Hephaestus - ETL and ML tools for OHDSI - OMOP CDM☆13Updated 4 months ago
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit…☆69Updated 3 weeks ago
- The PEDSnet Data Quality Assessment Toolkit (OMOP CDM)☆27Updated 4 years ago
- Extension to Python-Markdown to translate pydantic's model fields to markdown table☆12Updated last year
- Build your feature store with macros right within your dbt repository☆39Updated 3 years ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation.☆36Updated 6 years ago
- Dask integration for Snowflake☆30Updated 5 months ago
- Fake Pandas / PySpark DataFrame creator☆48Updated last year
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.☆83Updated last year
- Pandas helper functions☆31Updated 2 years ago
- Build and deploy a serverless data pipeline on AWS with no effort.☆110Updated 2 years ago
- The best Python package for comparing two dataframes☆11Updated 4 years ago
- Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0☆45Updated 3 years ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀☆39Updated 3 years ago
- ☆58Updated last month
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data.☆81Updated last week
- A repository containing an introduction to Panel, made to support videos and talks.☆56Updated 4 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs☆20Updated 4 years ago
- Content for a talk on "The wonderful world of data quality tools in Python"☆18Updated 4 years ago
- A collection of Pandas helper functions.☆14Updated 2 years ago
- Omnipy is a high level Python library for type-driven data wrangling and scalable workflow orchestration (under development)☆25Updated 3 weeks ago