Clearbox-AI / StructuredDataProfiling
A Python library to check for data quality and automatically generate data tests.
☆42 Updated last year
Related projects
Alternatives and complementary repositories for StructuredDataProfiling
- A Python library to perform NER on structured data and generate PII with Faker ☆27 Updated 5 months ago
- Python Biella Group basic template for a modern generic Python application ☆12 Updated 5 months ago
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆174 Updated this week
- ⚓ Eurybia monitors model drift over time and secures model deployment with data validation ☆205 Updated last month
- A write-audit-publish implementation on a data lake without the JVM ☆41 Updated 3 months ago
- Python package implementing transformers for preprocessing steps for machine learning. ☆45 Updated this week
- Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer. ☆127 Updated 11 months ago
- Monitor the stability of a Pandas or Spark dataframe ⚙︎ ☆497 Updated 2 months ago
- Swiple enables you to easily observe, understand, validate and improve the quality of your data ☆79 Updated this week
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆112 Updated last week
- Frouros: an open-source Python library for drift detection in machine learning systems. ☆194 Updated this week
- Set up a Cost-Effective Modern Data Stack for a Charity ☆19 Updated 8 months ago
- Generating Realistic Synthetic Data ☆31 Updated 9 months ago
- Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of p… ☆319 Updated last week
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆38 Updated 2 weeks ago
- Assessing whether data from a database complies with reference information. ☆42 Updated last week
- A curated list of awesome open source tools and commercial products for monitoring data quality, monitoring model performance, and profil… ☆64 Updated 6 months ago
- A tool to automatically infer column data types in .csv files ☆35 Updated last year
- Food for thought around data contracts ☆24 Updated last week
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆53 Updated 2 months ago
- Plugins, extensions, case studies, articles, and video tutorials for Kedro ☆64 Updated last month
- Proof-of-concept extension combining the delta extension with Unity Catalog ☆54 Updated this week
- Metrics to evaluate quality and efficacy of synthetic datasets. ☆213 Updated this week
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆26 Updated this week
- A Kedro plugin that streamlines the integration between Kedro projects and third-party applications, making it easier for you to develop… ☆37 Updated last week