Similarity encoding of dirty categorical variables (strings)
☆20 · Jan 22, 2019 · Updated 7 years ago
Alternatives and similar repositories for spark-dirty-cat
Users who are interested in spark-dirty-cat compare it to the libraries listed below.
- This is the implementation of the Recursive Nearest (Neighbor) Agglomeration ☆11 · Oct 9, 2020 · Updated 5 years ago
- ☆29 · Jan 23, 2019 · Updated 7 years ago
- Interactive parametric benchmarks in Python ☆17 · Apr 18, 2021 · Updated 4 years ago
- Db2 JSON Examples using Jupyter Notebooks ☆32 · Jul 14, 2021 · Updated 4 years ago
- A short introduction to time series methods ☆30 · Sep 19, 2019 · Updated 6 years ago
- The ONS Big Data Team Github pages ☆10 · May 19, 2021 · Updated 4 years ago
- ☆10 · Aug 23, 2023 · Updated 2 years ago
- Parent repository for the MOJ Analytics Platform ☆14 · Nov 16, 2021 · Updated 4 years ago
- This repository contains material of a teaching innovation project in Universitat de Barcelona: "Intelligent Support System for Tutor of … ☆10 · Jun 30, 2020 · Updated 5 years ago
- A toolkit of functions and classes to help build isometric games with Lua ☆16 · Apr 21, 2025 · Updated 10 months ago
- ☆11 · Jan 28, 2019 · Updated 7 years ago
- A library for Partially Homomorphic Encryption in Python ☆12 · May 30, 2017 · Updated 8 years ago
- Reproducible Analytical Pipeline of the Hospital Standardised Mortality Ratio (HSMR) quarterly publication ☆11 · Jun 21, 2024 · Updated last year
- Multimodal data loader compatible with pytorch and tensorflow ☆12 · Aug 14, 2024 · Updated last year
- Curso de Machine Learning ☆11 · Apr 22, 2018 · Updated 7 years ago
- A JupyterHub authenticator using Kerberos ☆12 · Jul 16, 2019 · Updated 6 years ago
- A simple NER implementation using a DistilBERT based model with ML.NET ☆13 · May 6, 2021 · Updated 4 years ago
- An offline evaluation framework for sequence-based recommender systems ☆13 · May 17, 2019 · Updated 6 years ago
- Feature-level domain adaptation ☆11 · Sep 6, 2019 · Updated 6 years ago
- Implementation of grid search to find better hyperparameters for a decision tree and reduce overfitting. ☆12 · May 29, 2021 · Updated 4 years ago
- An example of graph embeddings for wikipedia page recommendations ☆11 · Aug 26, 2021 · Updated 4 years ago
- KumuluzEE REST extension for implementation of common, advanced and flexible REST API functionalities and patterns as microservices. ☆10 · Jan 12, 2026 · Updated last month
- Collaborative Filtering Autoencoder Neural Network ☆10 · Jun 27, 2018 · Updated 7 years ago
- The privacy-preserving record linkage toolkit: a proof-of-concept public demo of next-gen data linkage techniques. ☆16 · May 22, 2024 · Updated last year
- Demo of an In-database processing tool for scikit-learn ☆13 · Oct 18, 2022 · Updated 3 years ago
- A simple python library to spot holiday "bridges" and long weekends. ☆10 · Aug 19, 2021 · Updated 4 years ago
- Code to implement the network histogram (Olhede and Wolfe, arXiv:1312.5306) ☆11 · Sep 23, 2014 · Updated 11 years ago
- Public repo containing code to train, visualize, and evaluate semi-supervised topic models and baselines for regression/classification on… ☆12 · Apr 15, 2020 · Updated 5 years ago
- Erlang Parser Combinator library for Parser Expression Grammars (PEG) ☆13 · Jun 13, 2015 · Updated 10 years ago
- JSON tools around `jq` and other utilities ☆13 · May 12, 2019 · Updated 6 years ago
- ☆11 · Nov 10, 2017 · Updated 8 years ago
- Openscoring application for the Docker distributed applications platform ☆12 · Nov 8, 2020 · Updated 5 years ago
- Nim wrapper for librdkafka ☆10 · Dec 28, 2023 · Updated 2 years ago
- ☆11 · Jul 26, 2020 · Updated 5 years ago
- Utilizing AutoXGB for Credit Card Financial Fraud Detection ☆12 · Dec 1, 2021 · Updated 4 years ago
- Example that uses GraalVM native image to create a shared library, callable from C, Ruby or other ecosystems that support foreign functio… ☆11 · Mar 31, 2023 · Updated 2 years ago
- ☆12 · Nov 14, 2024 · Updated last year
- Install micromamba, and optionally create a base conda environment. ☆10 · Apr 5, 2025 · Updated 11 months ago
- Pre-train Embedding in LightFM Recommender System Framework ☆11 · Apr 28, 2019 · Updated 6 years ago