Clearbox-AI / StructuredDataProfiling
A Python library to check for data quality and automatically generate data tests.
☆42Updated last year
Alternatives and similar repositories for StructuredDataProfiling
Users who are interested in StructuredDataProfiling are comparing it to the libraries listed below.
- A Python library to perform NER on structured data and generate PII with Faker☆30Updated 11 months ago
- An open-source Python library for the assessment of utility and privacy performance of any tabular synthetic dataset.☆23Updated this week
- Clearbox AI's all-in-one solution for generation and evaluation of synthetic tabular and time-series data.☆42Updated this week
- Python Biella Group basic template for a modern generic Python application☆12Updated 3 weeks ago
- Modern Data Engineering Project☆12Updated 2 years ago
- Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.☆130Updated last year
- An agnostic wrapper for the most common ML frameworks.☆14Updated 3 years ago
- Possibly the fastest DataFrame-agnostic quality check library in town.☆188Updated this week
- Data Quality assessment with one line of code☆442Updated this week
- Repository with material from the lessons and the topics covered☆32Updated last month
- ☆16Updated last month
- prefect integration for running dbt☆61Updated 8 months ago
- Monitor the stability of a Pandas or Spark dataframe ⚙︎☆500Updated 3 months ago
- ☆53Updated this week
- A Python library to test your SQL models using mocked input data☆45Updated last year
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects☆220Updated 2 weeks ago
- A tool to automatically infer column data types in .csv files☆35Updated 2 years ago
- dagster scikit-learn pipeline example.☆44Updated 2 years ago
- Data Tools Subjective List☆83Updated last year
- Set up a Cost-Effective Modern Data Stack for a Charity☆19Updated last month
- IbisML is a library for building scalable ML pipelines using Ibis.☆108Updated 4 months ago
- A write-audit-publish implementation on a data lake without the JVM☆46Updated 9 months ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆113Updated last year
- Linear regression in SQL using dbt☆70Updated 4 months ago
- Swiple enables you to easily observe, understand, validate and improve the quality of your data☆83Updated this week
- Open-Source Software, Tutorials, and Research on Data-Centric AI 🤖☆335Updated last year
- ☆16Updated last year
- Generate the ERD as a code from dbt artifacts☆248Updated 3 weeks ago
- Tutorial for implementing data validation in data science pipelines☆33Updated 2 years ago
- Kedro Plugin to support running workflows on Kubeflow Pipelines☆54Updated 8 months ago