pnavaro / big-data
Python tools for big data
☆51 · Updated 11 months ago
Related projects:
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆82 · Updated 8 months ago
- Companion Notebooks and Data for Data Science with Python and Dask from Manning Publications ☆52 · Updated 4 years ago
- Phi_K correlation analyzer library ☆155 · Updated last week
- Data Analysis Baseline Library ☆130 · Updated 8 months ago
- Jupyter Widget for Lux ☆72 · Updated last year
- Public notebooks and datasets to accompany the Data Analysis with Polars course on Udemy ☆39 · Updated last year
- Notebooks that support blog posts and tech talks on Dask / Coiled. ☆42 · Updated 7 months ago
- A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using th… ☆112 · Updated last year
- Python port of "Common statistical tests are linear models" by Jonas Kristoffer Lindeløv. ☆87 · Updated last month
- This repository contains materials for AC295 fall 2020 ☆19 · Updated 3 years ago
- Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer. ☆124 · Updated 9 months ago
- How to Interpret SHAP Analyses: A Non-Technical Guide ☆42 · Updated 2 years ago
- Comparison of big data technologies for cleaning, manipulating, and generally wrangling data for analysis and machine learning. ☆65 · Updated 4 years ago
- Explorations of survival analysis in Python ☆50 · Updated last year
- Code and materials for Effective Polars book ☆63 · Updated 5 months ago
- PyData London 2022 Tutorial ☆65 · Updated 2 years ago
- pipreqs with jupyter notebook support ☆64 · Updated last year
- Foundational tools for BCG X's data science packages. ☆33 · Updated last month
- Learn Python through Data Processing in Pandas Tutorial ☆38 · Updated 4 years ago
- Talks about vaex ☆36 · Updated last year
- DataFrame support for scikit-learn. ☆63 · Updated 10 months ago
- The Data Science Interview Book ☆31 · Updated 6 months ago
- A short example showing how to write a lecture series using Jupyter Book 2.0. ☆35 · Updated 2 years ago
- Dockerfile templates for creating RAPIDS Docker Images ☆69 · Updated last week
- Sensible multi-core apply function for Pandas ☆76 · Updated 2 weeks ago
- 💫 PyScaffold extension for data-science projects ☆155 · Updated last month
- Wrapper around Google APIs to create charts in Google Slides with Python ☆30 · Updated 2 years ago
- ForML - A development framework and MLOps platform for the lifecycle management of data science projects ☆105 · Updated last year
- Tutorials on creating a reproducible and maintainable data science project ☆135 · Updated 2 years ago
- Altair backend for pandas plotting ☆101 · Updated 3 years ago