telia-oss / birgitta
Birgitta is a Python ETL test and schema framework, providing automated tests for pyspark notebooks/recipes.
☆14Updated last year
Alternatives and similar repositories for birgitta:
Users who are interested in birgitta are comparing it to the libraries listed below.
- Hephaestus - ETL and ML tools for OHDSI - OMOP CDM☆13Updated 2 years ago
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom…☆44Updated 9 months ago
- Utility functions for dbt projects running on Spark☆32Updated 2 months ago
- Using the Parquet file format with Python☆15Updated last year
- Astronomer Vendor Images☆14Updated this week
- 🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎☆16Updated 2 years ago
- Build your feature store with macros right within your dbt repository☆38Updated 2 years ago
- A collection of python utility functions☆11Updated 9 months ago
- Cloud-agnostic Python API☆60Updated 10 months ago
- Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0☆45Updated 2 years ago
- Awesome List for Data Operations☆24Updated 4 years ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included.☆10Updated last year
- Pocket data flows orchestrated using Prefect☆45Updated last month
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.☆57Updated last week
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in …☆21Updated 2 years ago
- This connector is a dbt project that maps Medicare CCLF claims data to the Tuva Input Layer.☆13Updated last month
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/☆11Updated 11 months ago
- Collection of utility scripts to extract code so it can be upgraded to SnowFlake using the SnowConvert tool.☆14Updated last week
- Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support.☆11Updated 3 months ago
- Awesome Orchest projects, both official and submitted by the community.☆25Updated last year
- Full stack data engineering tools and infrastructure set-up☆51Updated 4 years ago
- Function decorators for Pandas Dataframe column name and data type validation☆17Updated last month
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀☆32Updated 3 years ago
- CLI for data platform☆19Updated last year