dirty-cat / dirty_cat
Machine learning on dirty tabular data (legacy clone of skrub)
☆17 · Updated 2 months ago
Alternatives and similar repositories for dirty_cat:
Users who are interested in dirty_cat compare it to the libraries listed below.
- Implementation of Cyclic Boosting machine learning algorithms ☆89 · Updated 5 months ago
- Rethinking machine learning pipelines ☆28 · Updated 2 months ago
- A proof-of-concept for a RAG to query the scikit-learn documentation ☆26 · Updated this week
- DataFrame support for scikit-learn. ☆62 · Updated last year
- Tools for diagnostics and assessment of (machine learning) models ☆34 · Updated this week
- An abstraction layer for parameter tuning ☆35 · Updated 5 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆100 · Updated last month
- Exploratory repository to study predictive survival analysis models ☆31 · Updated last year
- Time-based splits for cross-validation ☆35 · Updated 2 weeks ago
- ☆41 · Updated 7 months ago
- Competing Risks and Survival Analysis ☆68 · Updated this week
- Assessing whether data from a database complies with reference information. ☆42 · Updated this week
- Kedro extension for VSCode including LSP and other features ☆19 · Updated this week
- Function decorators for Pandas DataFrame column name and data type validation ☆16 · Updated last week
- Python package implementing transformers for preprocessing steps for machine learning. ☆54 · Updated last week
- skchange provides sktime-compatible change detection and changepoint-based anomaly detection algorithms ☆22 · Updated 2 weeks ago
- Identifiers and Standard Format Parsing for Polars DataFrame ☆14 · Updated this week
- MetaLearners for CATE estimation ☆36 · Updated this week
- mlmachine accelerates machine learning experimentation ☆30 · Updated 3 years ago
- Causal Impact but with MFLES and conformal prediction intervals ☆34 · Updated last month
- Sentiment and language detection for text analytics. ☆16 · Updated 7 months ago
- Polars plugin for pairwise distance functions ☆62 · Updated 2 months ago
- Fast window operations ☆39 · Updated 8 months ago
- High performance Python GLMs with all the features! ☆326 · Updated this week
- General functions for your data .pipe()-lines. ☆16 · Updated last year
- A grammar of data manipulation for pandas inspired by tidyverse ☆97 · Updated 11 months ago
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆53 · Updated 5 months ago
- Explainable Boosted Scoring ☆14 · Updated 5 months ago
- A scikit-learn compatible estimator based on business rules, with an interactive dashboard included ☆28 · Updated 3 years ago