capitalone / synthetic-data
Generating complex, nonlinear datasets appropriate for use with deep learning/black box models which 'need' nonlinearity
☆44 Updated last year
Alternatives and similar repositories for synthetic-data
Users that are interested in synthetic-data are comparing it to the libraries listed below
- Capture all information throughout your model's development in a reproducible way and tie results directly to the model code! ☆135 Updated 2 weeks ago
- A software engineering framework to jump start your machine learning projects ☆37 Updated last year
- A library of Reversible Data Transforms ☆127 Updated last week
- openclean - Data Cleaning and data profiling library for Python ☆80 Updated 3 years ago
- An abstraction layer for parameter tuning ☆35 Updated 11 months ago
- Assessing whether data from database complies with reference information. ☆43 Updated this week
- Type System for Data Analysis in Python ☆213 Updated 6 months ago
- real-time data + ML pipeline ☆54 Updated this week
- Start a data science project with modern tools ☆199 Updated 2 years ago
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆54 Updated last month
- GAM (Global Attribution Mapping) explains the landscape of neural network predictions across subpopulations ☆34 Updated 3 weeks ago
- Personal Finance Project to automatically collect swiss banking transaction into a DWH and visualise it ☆26 Updated last year
- Experimental MLflow plugin for Google Cloud Vertex AI ☆38 Updated 2 months ago
- A library to find and visualise the most interesting slices in multidimensional data ☆109 Updated 4 months ago
- The easiest way to integrate Kedro and Great Expectations ☆53 Updated 2 years ago
- Dask integration for Snowflake ☆30 Updated last week
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆79 Updated 10 months ago
- Metafeature Extraction for Unstructured Data ☆102 Updated 4 months ago
- Abstractions for feature engineering on large graphs of tabular data. ☆22 Updated 2 months ago
- Build your feature store with macros right within your dbt repository ☆39 Updated 2 years ago
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆84 Updated last year
- Kedro-Accelerator speeds up pipelines by parallelizing I/O in the background. ☆36 Updated 3 years ago
- Repository for the ML Technology Readiness Levels framework ☆39 Updated last year
- Playground for using large language models into the Modern Data Stack for entity matching ☆108 Updated 2 years ago
- A playground for running duckdb as a stateless query engine over a data lake ☆210 Updated last year
- Woodwork is a Python library that provides robust methods for managing and communicating data typing information. ☆155 Updated 3 weeks ago
- ByteHub: making feature stores simple ☆61 Updated 4 years ago
- DataFrame support for scikit-learn. ☆63 Updated 2 weeks ago
- A toolbox 🧰 for Jupyter notebooks 📙: testing, experiment tracking, debugging, profiling, and more! ☆67 Updated 10 months ago
- Explore and compare 1K+ accurate decision trees in your browser! ☆164 Updated last year