Tutorial material on machine learning with dirty data in Python
☆61Jul 7, 2024Updated last year
Alternatives and similar repositories for python
Users that are interested in python are comparing it to the libraries listed below.
- This is the implementation of the Recursive Nearest (Neighbor) Agglomeration☆11Oct 9, 2020Updated 5 years ago
- eds-scikit is a Python library providing tools to process and analyse OMOP data☆45Dec 19, 2024Updated last year
- data⎰describe: Pythonic EDA Accelerator for Data Science☆302Feb 22, 2023Updated 3 years ago
- Machine learning with dataframes☆1,597Apr 23, 2026Updated last week
- Simulations for predictive model selection in causal inference☆13Jan 16, 2025Updated last year
- Similarity encoding of dirty categorical variables (strings)☆20Jan 22, 2019Updated 7 years ago
- Blog posts I've created about python, pandas, and related topics as a series of notebooks.☆23Apr 5, 2023Updated 3 years ago
- Experiments for the NeurIPS 2021 paper "Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks"☆13Oct 25, 2021Updated 4 years ago
- ☆14Sep 8, 2023Updated 2 years ago
- This web crawler can be customized to scrape almost all types of websites.☆11Dec 31, 2021Updated 4 years ago
- Python/PyMC3 port of the examples in "Statistical Rethinking: A Bayesian Course with Examples in R and Stan" by Richard McElreath☆19Oct 28, 2017Updated 8 years ago
- [ICCVW2025] V-RoAst: A New Dataset for Visual Road Assessment☆11Dec 17, 2025Updated 4 months ago
- A simple library to build a VRT from multiple raster sources relying only on rasterio☆14Dec 7, 2024Updated last year
- A Mermaid widget for interactively exploring Mermaid diagrams in notebooks and Panel data apps☆12Oct 25, 2024Updated last year
- Resources for Survival Analysis☆101Jul 3, 2025Updated 9 months ago
- Microbenchmark testing Python, Numba, Mojo, Dart, C/gcc, Rust, Go, JavaScript, C#, Java, Kotlin, Pascal, Ruby, Haskell performance in Man…☆15Mar 26, 2025Updated last year
- ☆23Jan 27, 2022Updated 4 years ago
- simplify geographic shapes☆12Sep 21, 2015Updated 10 years ago
- Recipes for using Python's polars library☆278Sep 8, 2024Updated last year
- A Python package for calculating reference evapotranspiration☆14Mar 9, 2026Updated last month
- Code repo for "Transformer on a Diet" paper☆31Jun 22, 2020Updated 5 years ago
- ☆13Jan 23, 2019Updated 7 years ago
- ETNA – Time-Series Library☆885Aug 9, 2023Updated 2 years ago
- An open-source NLP library: fast text cleaning and preprocessing☆23Nov 9, 2021Updated 4 years ago
- KEN: Relational Data Embeddings☆34Jan 2, 2024Updated 2 years ago
- My blog☆11Nov 17, 2020Updated 5 years ago
- 🧬 Modularised Evolutionary Algorithms For Python with Optional JIT and Multiprocessing (Ray) support. Inspired by PyTorch Lightning☆52Mar 29, 2023Updated 3 years ago
- Codes for paper "KNAS: Green Neural Architecture Search"☆93Nov 18, 2021Updated 4 years ago
- Master repository for the pandas-ml modules☆164Jul 23, 2023Updated 2 years ago
- A tutorial for setting a new machine with core data science tools☆296Jan 3, 2025Updated last year
- Fast sequence vectorization for metagenomics analysis. Converts input sequences into oligonucleotide frequency vectors, fast!☆14May 12, 2024Updated last year
- general functions for your data .pipe()-lines.☆17Nov 8, 2023Updated 2 years ago
- Helpers for parameters in black-box optimization, tuning and machine learning.☆26Dec 22, 2024Updated last year
- Reproducible Data Science with Python☆36Jan 10, 2023Updated 3 years ago
- ☆16Jan 24, 2026Updated 3 months ago
- a GitHub action to run `pre-commit` with `uv`☆19Mar 24, 2026Updated last month
- Python package for downloading and formatting the UK's Road Safety Data.☆16Aug 3, 2024Updated last year
- Cell tracking for longitudinal calcium imaging recordings.☆15Dec 11, 2025Updated 4 months ago
- Python library for plotting bivariate choropleth maps☆41Apr 6, 2026Updated 3 weeks ago