pnavaro / big-data
Python tools for big data
☆53 Updated last year
Alternatives and similar repositories for big-data:
Users who are interested in big-data are comparing it to the libraries listed below.
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆83 Updated last year
- Notebooks that support blog posts and tech talks on Dask / Coiled. ☆47 Updated 3 weeks ago
- Public notebooks and datasets to accompany the Data Analysis with Polars course on Udemy ☆43 Updated last year
- Wrap-up to automatically tune xgboost in Python. ☆80 Updated 3 years ago
- This Repository contains the material for the tutorial "Introduction to MLOps with MLflow" held at pyData/pyCon Berlin 2022. ☆23 Updated 2 years ago
- Tool for whitebox (binning + logreg) model development ☆77 Updated 3 years ago
- A package for interactive visual analysis in jupyter notebooks ☆22 Updated 2 years ago
- How to Interpret SHAP Analyses: A Non-Technical Guide ☆52 Updated 3 years ago
- Data Analysis Baseline Library ☆131 Updated 5 months ago
- An abstraction layer for parameter tuning ☆35 Updated 6 months ago
- Pre-Modelling Analysis of the data, by doing various exploratory data analysis and Statistical Test. ☆51 Updated last year
- Adding timestamps to NumFOCUS and PyData YouTube videos! ☆87 Updated 2 years ago
- Source Code for 'Modern Deep Learning for Tabular Data' by Andre Ye and Ziang Wang ☆30 Updated 2 years ago
- Content for a talk on "The wonderful world of data quality tools in Python" ☆19 Updated 3 years ago
- Start a data science project with modern tools ☆192 Updated last year
- Altair backend for pandas plotting ☆102 Updated 4 years ago
- 💫 PyScaffold extension for data-science projects ☆158 Updated 3 weeks ago
- Jupyter Widget for Lux ☆76 Updated 2 years ago
- A repository used to provide an introduction to dataviz in Python ☆53 Updated 2 years ago
- Exploratory repository to study predictive survival analysis models ☆34 Updated last year
- A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using th… ☆115 Updated 2 years ago
- Pandas helper functions ☆30 Updated 2 years ago
- Code and materials for Effective Polars book ☆75 Updated 11 months ago
- Companion Notebooks and Data for Data Science with Python and Dask from Manning Publications ☆51 Updated 4 years ago
- Exploring some issues related to churn ☆16 Updated last year
- Tutorial material on machine learning with dirty data in Python ☆61 Updated 8 months ago
- Increase citations, ease review & collaboration. A collection of "easy wins" to make machine learning in research reproducible. This tut… ☆74 Updated 3 months ago