rakutentech / spark-dirty-cat
Similarity encoding of dirty categorical variables (strings)
☆19 · Updated 5 years ago
Related projects:
- 🧮 Extended Latent Dirichlet Allocation for Collaborative Filtering in Recommender Systems. ☆40 · Updated 2 years ago
- Helpers for scikit learn ☆16 · Updated last year
- Pipeline components that support partial_fit. ☆42 · Updated 2 months ago
- ☆28 · Updated 5 years ago
- Extra functionalities for river ☆14 · Updated 4 months ago
- In-Session Personalization Workshop for eCommerce, April 2021, and the MICES Workshop in June 2021. ☆21 · Updated 3 years ago
- this repo might get accepted ☆29 · Updated 3 years ago
- Prune your sklearn models ☆19 · Updated last year
- scikit-learn gradient-boosting-model interactions ☆25 · Updated last year
- Scripts for paper "Encoding high-cardinality string categorical variables" ☆24 · Updated 5 years ago
- Python package for Bayesian & Frequentist A/B Testing ☆11 · Updated last year
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 · Updated 3 years ago
- Gradient boosting on steroids ☆26 · Updated 3 months ago
- How to use SHAP values for better cluster analysis ☆51 · Updated 2 years ago
- 🪜 Bayesian Hierarchical Models at Scale ☆50 · Updated 3 years ago
- introduction class to recommendation systems ☆22 · Updated 5 years ago
- ☆67 · Updated this week
- Python library to explain Tree Ensemble models (TE) like XGBoost, using a rule list. ☆40 · Updated 4 months ago
- Ordinal regression in Python ☆65 · Updated 3 months ago
- Cyclic Boosting Machines - an explainable supervised machine learning algorithm ☆57 · Updated 2 weeks ago
- Python implementation of R package breakDown ☆41 · Updated last year
- Record matching and entity resolution at scale in Spark ☆31 · Updated 10 months ago
- ☆27 · Updated 2 years ago
- A fast numpy-based implementation of ranking metrics for information retrieval and recommendation. ☆31 · Updated 2 years ago
- Repository for my master thesis on automated string handling ☆16 · Updated 3 years ago
- Logistic regression with bound and linear constraints. L1, L2 and Elastic-Net regularization. ☆33 · Updated last year
- 📈🔍 Lets Python do AB testing analysis ☆75 · Updated 5 months ago
- Surrogate Assisted Feature Extraction ☆35 · Updated 3 years ago
- Fast Bayesian A/B and Multivariate testing. ☆36 · Updated last year
- CBM Encoding ☆19 · Updated 3 years ago