telia-oss / birgitta
Birgitta is a Python ETL test and schema framework, providing automated tests for pyspark notebooks/recipes.
☆14 Updated 2 years ago
Alternatives and similar repositories for birgitta
Users who are interested in birgitta are comparing it to the libraries listed below.
- Medium Article ☆11 Updated 4 years ago
- [under development] ETL materials to support proposal for CDM enhancements for clinical trial data ☆24 Updated 4 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆65 Updated last week
- ☆11 Updated 4 years ago
- Simple samples for writing ETL transform scripts in Python ☆24 Updated 3 weeks ago
- FHIR to OMOP using PySpark on AWS Glue ☆14 Updated 4 years ago
- A collection of python utility functions ☆11 Updated 2 months ago
- Hephaestus - ETL and ML tools for OHDSI - OMOP CDM ☆13 Updated 3 months ago
- Python package for managing OHDSI clinical data models. Includes support for LLM based plain text queries, MCP server and FHIR import. ☆56 Updated 2 weeks ago
- ☆12 Updated 2 months ago
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆79 Updated last year
- Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0 ☆45 Updated 3 years ago
- Extension to Python-Markdown to translate pydantic's model fields to markdown table ☆12 Updated last year
- The PEDSnet Data Quality Assessment Toolkit (OMOP CDM) ☆26 Updated 4 years ago
- A collection of Pandas helper functions. ☆14 Updated 2 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 Updated 4 years ago
- Build and deploy a serverless data pipeline on AWS with no effort. ☆111 Updated 2 years ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆36 Updated 6 years ago
- 🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎 ☆16 Updated 3 years ago
- Dask integration for Snowflake ☆30 Updated 5 months ago
- A small Python module containing quick utility functions for standard ETL processes. ☆37 Updated 3 weeks ago
- Check the basic quality of any dataset ☆12 Updated 4 years ago
- The best Python package for comparing two dataframes ☆11 Updated 4 years ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆126 Updated 4 years ago
- ☆31 Updated 2 years ago
- Parquet file management in S3 for Athena / Spectrum / Presto partitioning ☆22 Updated 11 months ago
- Fake Pandas / PySpark DataFrame creator ☆48 Updated last year
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit… ☆68 Updated 3 weeks ago
- Pandas helper functions ☆31 Updated 2 years ago
- Pandas in black and white: a collection of opinionated pandas flashcards ☆14 Updated 6 years ago