dirty-data-science / python
Tutorial material on machine learning with dirty data in Python
☆60 Updated 11 months ago
Alternatives and similar repositories for python
Users interested in python are comparing it to the libraries listed below
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆84 Updated last year
- NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for … ☆106 Updated 3 years ago
- data⎰describe: Pythonic EDA Accelerator for Data Science ☆301 Updated 2 years ago
- Python port of "Common statistical tests are linear models" by Jonas Kristoffer Lindeløv. ☆94 Updated 10 months ago
- Phi_K correlation analyzer library ☆164 Updated 4 months ago
- Data Analysis Baseline Library ☆132 Updated 8 months ago
- Hypergol is a Data Science/Machine Learning productivity toolkit to accelerate any projects into production with autogenerated code, stan… ☆53 Updated 2 years ago
- Clusteval provides methods for unsupervised cluster validation ☆60 Updated 2 months ago
- ForML - A development framework and MLOps platform for the lifecycle management of data science projects ☆107 Updated 2 years ago
- bayes-toolbox ☆93 Updated this week
- Makes Interactive Chart Widget, Cleans raw data, Runs baseline models, Interactive hyperparameter tuning & tracking ☆55 Updated 3 years ago
- CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system ☆77 Updated 2 years ago
- A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using th… ☆115 Updated 2 years ago
- General Interpretability Package ☆58 Updated 2 years ago
- SciKIt-learn Pipeline in PAndas ☆42 Updated last year
- Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library ☆51 Updated 2 years ago
- Clustergram - Visualization and diagnostics for cluster analysis in Python ☆126 Updated 2 months ago
- Clustering for mixed-type data ☆99 Updated 10 months ago
- Advanced random forest methods in Python ☆57 Updated last year
- Missing data amputation and exploration functions for Python ☆71 Updated 2 years ago
- Companion Notebooks and Data for Data Science with Python and Dask from Manning Publications ☆52 Updated 4 years ago
- Notebooks for Keras Tutorial presented at ODSC West 2020 ☆26 Updated 4 years ago
- Exploratory repository to study predictive survival analysis models ☆34 Updated 2 years ago
- Example PyMC3 project for performing Bayesian data analysis using a probabilistic programming approach to machine learning. ☆105 Updated 6 years ago
- PHOTONAI is a high level python API for designing and optimizing machine learning pipelines. ☆77 Updated 2 weeks ago
- A curated list of Python libraries used for data science. ☆89 Updated last year
- An abstraction layer for parameter tuning ☆35 Updated 9 months ago
- This repo has moved to https://github.com/INRIA/scikit-learn-mooc/ ☆42 Updated 5 years ago
- TigerLily: Finding drug interactions in silico with the Graph. ☆100 Updated 2 years ago
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 Updated 3 years ago