rakutentech / spark-dirty-cat
Similarity encoding of dirty categorical variables (strings)
☆20 · Updated 6 years ago
Alternatives and similar repositories for spark-dirty-cat
Users interested in spark-dirty-cat are comparing it to the libraries listed below.
- 🧮 Extended Latent Dirichlet Allocation for Collaborative Filtering in Recommender Systems. ☆41 · Updated 2 years ago
- Extra functionalities for river ☆14 · Updated 11 months ago
- Helpers for scikit-learn ☆16 · Updated 2 years ago
- Scripts for paper "Encoding high-cardinality string categorical variables" ☆24 · Updated 5 years ago
- ☆28 · Updated 6 years ago
- Cyclic Boosting Machines - an explainable supervised machine learning algorithm ☆60 · Updated 8 months ago
- Pipeline components that support partial_fit. ☆46 · Updated 9 months ago
- scikit-learn gradient-boosting-model interactions ☆25 · Updated 2 years ago
- Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library ☆51 · Updated 2 years ago
- Prune your sklearn models ☆19 · Updated 6 months ago
- Implementation of algorithms from the paper "Globally-Consistent Rule-Based Summary-Explanations for Machine Learning Models: Application… ☆25 · Updated 2 years ago
- Bayesian Hierarchical Models at Scale ☆52 · Updated 3 years ago
- Lets Python do AB testing analysis. ☆76 · Updated 3 weeks ago
- My collection of causal inference algorithms built on top of accessible, simple, out-of-the-box ML methods, aimed at being explainable an… ☆30 · Updated 2 years ago
- Exploratory repository to study predictive survival analysis models ☆34 · Updated last year
- this repo might get accepted ☆28 · Updated 4 years ago
- Embed categorical variables via neural networks. ☆59 · Updated 2 years ago
- Fast Bayesian A/B and Multivariate testing. ☆36 · Updated 2 years ago
- In-Session Personalization Workshop for eCommerce, April 2021, and the MICES Workshop in June 2021. ☆22 · Updated 3 years ago
- Spark implementation of computing Shapley Values using Monte Carlo approximation ☆74 · Updated 2 years ago
- Paper and talk from KDD 2019 XAI Workshop ☆20 · Updated 4 years ago
- Distributed, large-scale, benchmarking framework for rigorous assessment of automatic machine learning repositories, projects, and librar… ☆30 · Updated 2 years ago
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 · Updated 3 years ago
- Record matching and entity resolution at scale in Spark ☆34 · Updated last year
- Python implementation of R package breakDown ☆43 · Updated last year
- An experiment on explicit vs implicit feedback recommenders ☆25 · Updated 7 years ago
- Python implementation of "Content-based recommendations with poisson factorization", with some extensions ☆30 · Updated last year
- Gradient boosting on steroids ☆28 · Updated 10 months ago
- Python package to visualize and cluster partial dependence. ☆28 · Updated 3 years ago
- Python package for Bayesian Tests / AB Testing ☆40 · Updated 4 years ago