AlexFrid / anonymizedf
a convenient way to anonymize your data for analytics
☆20 · Updated 3 years ago
Alternatives and similar repositories for anonymizedf:
Users who are interested in anonymizedf are comparing it to the libraries listed below.
- Build your feature store with macros right within your dbt repository ☆38 · Updated 2 years ago
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆49 · Updated last year
- ☆30 · Updated last year
- An abstraction layer for parameter tuning ☆36 · Updated 4 months ago
- Cost Efficient Data Pipelines with DuckDB ☆48 · Updated 5 months ago
- Demo on how to use Prefect with Docker ☆25 · Updated 2 years ago
- Ingesting data with Pulumi, AWS lambdas and Snowflake in a scalable, fully replayable manner ☆71 · Updated 2 years ago
- Check the basic quality of any dataset ☆11 · Updated 3 years ago
- The easiest way to integrate Kedro and Great Expectations ☆53 · Updated 2 years ago
- Fake Pandas / PySpark DataFrame creator ☆44 · Updated 10 months ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated 9 months ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆25 · Updated 2 years ago
- Instant search for and access to many datasets in Pyspark. ☆34 · Updated 2 years ago
- Building 3D Trusted Data Pipelines With Dagster, Dbt, and Duckdb ☆19 · Updated last year
- dagster scikit-learn pipeline example. ☆44 · Updated last year
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆53 · Updated 3 years ago
- ☆11 · Updated last year
- This repo is an approach to TDD in machine learning model operation. It covers project structure, testing essentials using pytest with Gi… ☆15 · Updated 4 years ago
- ☆12 · Updated 6 months ago
- Pandas helper functions ☆30 · Updated last year
- mlctl is the control plane for MLOps. It provides a CLI and a Python SDK for supporting key operations related to MLOps, such as "model t… ☆25 · Updated 3 years ago
- Data-aware orchestration with dagster, dbt, and airbyte ☆30 · Updated last year
- Record matching and entity resolution at scale in Spark ☆32 · Updated last year
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 · Updated 3 years ago
- A fully-featured multi-source data pipeline for continuously extracting knowledge from COVID-19 data. ☆21 · Updated 3 years ago
- A streamlit component to embed Disqus in your applications. ☆11 · Updated 3 years ago
- A Python Package for Visualizing Categorical Data Over Time ☆41 · Updated 7 months ago
- DuckDB SQL Tools add DuckDB support to VSCode, and provide database schema and SQL query interfaces for the popular SQLTools extension, S… ☆13 · Updated 6 months ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆81 · Updated 8 months ago
- ☆35 · Updated 2 months ago