rakutentech / spark-dirty-cat
Similarity encoding of dirty categorical variables (strings)
☆20 · Updated 6 years ago
Alternatives and similar repositories for spark-dirty-cat
Users that are interested in spark-dirty-cat are comparing it to the libraries listed below
- ☆29 · Updated 6 years ago
- 🧮 Extended Latent Dirichlet Allocation for Collaborative Filtering in Recommender Systems. ☆42 · Updated 3 years ago
- Pipeline components that support partial_fit. ☆46 · Updated 11 months ago
- scikit-learn gradient-boosting-model interactions ☆25 · Updated 2 years ago
- Helpers for scikit learn ☆16 · Updated 2 years ago
- Python implementation of R package breakDown ☆43 · Updated last year
- Scripts for paper "Encoding high-cardinality string categorical variables" ☆24 · Updated 5 years ago
- Record matching and entity resolution at scale in Spark ☆34 · Updated last year
- Bayesian Hierarchical Models at Scale ☆52 · Updated 3 years ago
- Official Repository for EvalRS @ KDD 2023: a Rounded Evaluation of Recommender Systems ☆30 · Updated last year
- A fast numpy-based implementation of ranking metrics for information retrieval and recommendation. ☆32 · Updated 2 years ago
- Cyclic Boosting Machines - an explainable supervised machine learning algorithm ☆61 · Updated 9 months ago
- Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library ☆51 · Updated 2 years ago
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆74 · Updated 2 years ago
- Exploratory repository to study predictive survival analysis models ☆34 · Updated 2 years ago
- Estimators to perform off-policy evaluation ☆13 · Updated 9 months ago
- XAI Stories. Case studies for eXplainable Artificial Intelligence ☆29 · Updated 4 years ago
- How to use SHAP values for better cluster analysis ☆57 · Updated 3 years ago
- Gradient boosting on steroids ☆28 · Updated last year
- Tweedie family density estimation in python ☆28 · Updated last year
- Scripts for ECML PKDD 2018 article: Similarity encoding for learning with dirty categorical variables ☆11 · Updated 7 years ago
- Prune your sklearn models ☆19 · Updated 7 months ago
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 · Updated 3 years ago
- Python library for Ceteris Paribus Plots (What-if plots) ☆24 · Updated 4 years ago
- this repo might get accepted ☆28 · Updated 4 years ago
- Python package for Bayesian Tests / AB Testing ☆40 · Updated 4 years ago
- Sparrow is a boosting algorithm implementation that is optimized for training on very large datasets and/or in the limited memory setting… ☆21 · Updated 4 years ago
- Developmental tools to detect data drift ☆16 · Updated last year
- In which I play with the ideas surrounding causality ☆52 · Updated 2 years ago
- In-Session Personalization Workshop for eCommerce, April 2021, and the MICES Workshop in June 2021. ☆22 · Updated 3 years ago