mfcabrera / hooqu
hooqu is a library built on top of pandas-like DataFrames for defining "unit tests for data". It is a spiritual port of Apache Deequ to Python.
☆29 · Updated 4 months ago
Alternatives and similar repositories for hooqu:
Users who are interested in hooqu are comparing it to the libraries listed below.
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆53 · Updated 7 months ago
- Build your feature store with macros right within your dbt repository ☆38 · Updated 2 years ago
- Kedro Plugin to support running pipelines on Kubernetes using Airflow. ☆28 · Updated last month
- 🚕 Self-contained demo using Redpanda, Materialize, River, Redis, and Streamlit to predict taxi trip durations ☆47 · Updated 2 years ago
- Primrose modeling framework for simple production models ☆32 · Updated last year
- Record matching and entity resolution at scale in Spark ☆34 · Updated last year
- A software engineering framework to jump start your machine learning projects ☆37 · Updated 9 months ago
- Data Catalog for Databases and Data Warehouses ☆33 · Updated last year
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 · Updated last year
- Inspect ML Pipelines in Python in the form of a DAG ☆70 · Updated last year
- 🎯 kettle is a CLI tool for creating and deploying cloud functions & docker containers for machine learning ☆32 · Updated 2 years ago
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38 · Updated 2 years ago
- The sane way of building a data layer in Airflow ☆24 · Updated 5 years ago
- Batteries included toolkit for data engineering. ☆34 · Updated 3 months ago
- An abstraction layer for parameter tuning ☆35 · Updated 7 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆108 · Updated 3 months ago
- Assessing whether data from database complies with reference information. ☆42 · Updated last week
- A python library bakeoff for medium sized datasets ☆24 · Updated last year
- Demos of Materialize, the operational data warehouse. ☆51 · Updated last month
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆77 · Updated last year
- Projects developed by Domino's R&D team ☆76 · Updated 3 years ago
- ☕⛵ WIP PySpark dependency management ☆22 · Updated 6 years ago
- Distribution transparent Machine Learning experiments on Apache Spark ☆90 · Updated last year
- Similarity encoding of dirty categorical variables (strings) ☆20 · Updated 6 years ago
- ☀️🦶 A lightweight framework for collaborative, open-source feature engineering ☆32 · Updated 3 years ago
- Dask integration for Snowflake ☆30 · Updated 5 months ago