Quantmetry / pipeasy-spark
An easy way to define a data preprocessing pipeline (similar to sklearn-pandas, but for Spark ML)
☆17 · Updated 5 years ago
Related projects
Alternatives and complementary repositories for pipeasy-spark
- Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library ☆50 · Updated 2 years ago
- General Interpretability Package ☆58 · Updated last year
- ⬛ Python Individual Conditional Expectation Plot Toolbox ☆165 · Updated 4 years ago
- Embed categorical variables via neural networks. ☆59 · Updated last year
- A toolbox for fair and explainable machine learning ☆53 · Updated 4 months ago
- Surrogate Assisted Feature Extraction ☆36 · Updated 3 years ago
- Visualization ideas for data science ☆19 · Updated 6 years ago
- Hierarchical Time Series Forecasting with a familiar API ☆222 · Updated last year
- Repo for the ML_Insights python package ☆145 · Updated last year
- ACV is a python library that provides explanations for any machine learning model or data. It gives local rule-based explanations for any… ☆102 · Updated 2 years ago
- A tool for analyzing feature importance of xgboost model. idea came from R version xgboostExplainer ☆42 · Updated 6 years ago
- A simple, extensible library for developing AutoML systems ☆172 · Updated last year
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆74 · Updated last year
- 🍦 Deployment tool for online machine learning models ☆97 · Updated 2 years ago
- Hierarchical Time Series Forecasting using Prophet ☆142 · Updated 3 years ago
- scikit-learn-compatible estimators from Civis Analytics ☆59 · Updated 3 years ago
- Autoregressive Bayesian linear model ☆21 · Updated 4 years ago
- xverse (XuniVerse) is collection of transformers for feature engineering and feature selection ☆116 · Updated last year
- this repo might get accepted ☆29 · Updated 3 years ago
- Better `keras` models for time series and beyond ☆60 · Updated 9 months ago
- Model Error Analysis for scikit-learn models. ☆28 · Updated 2 years ago
- An extension of CatBoost to probabilistic modelling ☆141 · Updated last year
- Python implementation of R package breakDown ☆41 · Updated last year
- Feature selection package based on SHAP and target permutation, for pandas and Spark ☆30 · Updated 2 years ago
- Tutorial for a new versioning Machine Learning pipeline ☆81 · Updated 3 years ago
- Logistic regression with bound and linear constraints. L1, L2 and Elastic-Net regularization. ☆33 · Updated last year
- Validation (like Recursive Feature Elimination for SHAP) of (multiclass) classifiers & regressors and data used to develop them. ☆132 · Updated 2 months ago
- Home of the PipeGraph extension to Scikit-Learn ☆24 · Updated last year
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation… ☆165 · Updated 2 months ago
- ☆47 · Updated 6 years ago