capitalone / synthetic-data
Generating complex, nonlinear datasets appropriate for use with deep learning/black box models which 'need' nonlinearity
☆44Updated 11 months ago
Alternatives and similar repositories for synthetic-data
Users who are interested in synthetic-data are comparing it to the libraries listed below
- Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!☆132Updated this week
- Record matching and entity resolution at scale in Spark☆34Updated last year
- Abstractions for feature engineering on large graphs of tabular data.☆21Updated last week
- GAM (Global Attribution Mapping) explains the landscape of neural network predictions across subpopulations☆34Updated last month
- What's in your data? Extract schema, statistics and entities from datasets☆1,492Updated 2 months ago
- Assessing whether data from database complies with reference information.☆43Updated this week
- openclean - Data Cleaning and data profiling library for Python☆79Updated 3 years ago
- Type System for Data Analysis in Python☆212Updated 4 months ago
- A Kedro plugin that provides pandas drop-in replacements for the pandas datasets (e.g. modin and cuDF)☆12Updated 4 years ago
- Kedro Plugin to support running workflows on Kubeflow Pipelines☆54Updated 9 months ago
- Monitor the stability of a Pandas or Spark dataframe ⚙︎☆501Updated 4 months ago
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same…☆29Updated 2 years ago
- A library of Reversible Data Transforms☆127Updated this week
- Playground for using large language models into the Modern Data Stack for entity matching☆107Updated 2 years ago
- Demo repository to lambda-fy your dbt runs☆11Updated last year
- Spark implementation of computing Shapley Values using monte-carlo approximation☆74Updated 2 years ago
- ☆58Updated last year
- Instant search for and access to many datasets in Pyspark.☆34Updated 2 years ago
- An abstraction layer for parameter tuning☆35Updated 9 months ago
- Tutorials for YData's Fabric platform☆33Updated 3 weeks ago
- Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark☆76Updated 2 years ago
- This sample demonstrates how to set up an Amazon SageMaker MLOps end-to-end pipeline for drift detection☆60Updated last year
- Metrics to evaluate quality and efficacy of synthetic datasets.☆236Updated this week
- PyPi module for Graphlet AI Knowledge Graph Factory☆29Updated 2 years ago
- A dbt package designed to help SQL based analysis of graphs☆20Updated last year
- DataFrame support for scikit-learn.☆63Updated last year
- Projects developed by Domino's R&D team☆76Updated 3 years ago
- Buy Till You Die and Customer Lifetime Value statistical models in Python.☆117Updated last year
- First-party plugins maintained by the Kedro team.☆103Updated this week
- Probabilistic type inference☆29Updated 3 years ago