mfcabrera / hooqu
hooqu is a library built on top of pandas-like DataFrames for defining "unit tests for data". It is a spiritual port of Apache Deequ to Python.
☆27 Updated 3 months ago
Alternatives and similar repositories for hooqu:
Users interested in hooqu are comparing it to the libraries listed below.
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆53 Updated 6 months ago
- An abstraction layer for parameter tuning ☆35 Updated 6 months ago
- A python library bakeoff for medium sized datasets ☆24 Updated last year
- ☆27 Updated last year
- Inspect ML Pipelines in Python in the form of a DAG ☆70 Updated last year
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38 Updated 2 years ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆104 Updated 2 months ago
- BigQuery backend for Ibis ☆19 Updated last year
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- ☕⛵ WIP PySpark dependency management ☆22 Updated 6 years ago
- Primrose modeling framework for simple production models ☆33 Updated last year
- Demo repository to lambda-fy your dbt runs ☆11 Updated last year
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Kedro Plugin to support running pipelines on Kubernetes using Airflow. ☆28 Updated last week
- Data Catalog for Databases and Data Warehouses ☆33 Updated last year
- Record matching and entity resolution at scale in Spark ☆34 Updated last year
- Build your feature store with macros right within your dbt repository ☆38 Updated 2 years ago
- Assessing whether data from database complies with reference information. ☆42 Updated last week
- A write-audit-publish implementation on a data lake without the JVM ☆46 Updated 7 months ago
- A software engineering framework to jump start your machine learning projects ☆37 Updated 9 months ago
- Ingesting data with Pulumi, AWS lambdas and Snowflake in a scalable, fully replayable manner ☆71 Updated 3 years ago
- real-time data + ML pipeline ☆54 Updated last month
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆75 Updated last year
- The open-source Useful SDK. One python decorator in the Useful library allows for full observability of Python functions within an ETL. ☆20 Updated last year
- ☆11 Updated last year
- 🎯 kettle is a CLI tool for creating and deploying cloud functions & docker containers for machine learning ☆32 Updated 2 years ago
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆42 Updated last year
- Fast, resilient and reproducible data analysis with cached SQL queries ☆30 Updated last year
- Kedro Plugin to support running workflows on GCP Vertex AI Pipelines ☆36 Updated this week
- Linear regression in SQL using dbt ☆68 Updated 2 months ago