capitalone / synthetic-data
Generating complex, nonlinear datasets appropriate for use with deep learning/black box models which 'need' nonlinearity
☆44 Updated last year
Alternatives and similar repositories for synthetic-data
Users who are interested in synthetic-data are comparing it to the libraries listed below.
- Capture all information throughout your model's development in a reproducible way and tie results directly to the model code! ☆134 Updated this week
- Assessing whether data from database complies with reference information. ☆43 Updated last week
- A software engineering framework to jump start your machine learning projects ☆37 Updated last year
- Build your feature store with macros right within your dbt repository ☆39 Updated 2 years ago
- Tools and utilities for operating Metaflow in production ☆58 Updated 3 weeks ago
- Record matching and entity resolution at scale in Spark ☆34 Updated last year
- Dask integration for Snowflake ☆30 Updated 8 months ago
- real-time data + ML pipeline ☆54 Updated last week
- A library to find and visualise the most interesting slices in multidimensional data ☆109 Updated 3 months ago
- GAM (Global Attribution Mapping) explains the landscape of neural network predictions across subpopulations ☆34 Updated 2 months ago
- Explore and compare 1K+ accurate decision trees in your browser! ☆165 Updated last year
- DataFrame support for scikit-learn. ☆63 Updated this week
- An abstraction layer for parameter tuning ☆35 Updated 10 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆111 Updated this week
- The ML-airport-configuration software is developed to provide a reference implementation to serve as a research example how to train and … ☆28 Updated 3 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated last year
- Your favorite Python graph libraries, scalable and interoperable. Graph databases in memory, and familiar graph APIs for cloud databases. ☆110 Updated last month
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆54 Updated 2 weeks ago
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆74 Updated 2 years ago
- Templates for your Kedro projects. ☆77 Updated this week
- A library of Reversible Data Transforms ☆127 Updated this week
- Public blueprints for data use cases ☆79 Updated this week
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38 Updated 2 years ago
- Abstractions for feature engineering on large graphs of tabular data. ☆21 Updated last month
- Buy Till You Die and Customer Lifetime Value statistical models in Python. ☆117 Updated last year
- SPEAR: Programmatically label and build training data quickly. ☆107 Updated last year
- A write-audit-publish implementation on a data lake without the JVM ☆46 Updated 11 months ago
- openclean - Data Cleaning and data profiling library for Python ☆78 Updated 3 years ago
- ForML - A development framework and MLOps platform for the lifecycle management of data science projects ☆107 Updated 2 years ago
- Metafeature Extraction for Unstructured Data ☆102 Updated 4 months ago