pnavaro / big-data
Python tools for big data
☆53Updated last year
Alternatives and similar repositories for big-data
Users who are interested in big-data are comparing it to the libraries listed below.
- Wrapper to automatically tune xgboost in Python.☆80Updated 3 years ago
- An abstraction layer for parameter tuning☆35Updated 8 months ago
- Templates for Jupyter notebooks☆145Updated last year
- Altair backend for pandas plotting☆102Updated 4 years ago
- 💫 PyScaffold extension for data-science projects☆158Updated last month
- Source code to reproduce experiments from the article Practitioner’s Guide to Statistical Tests☆204Updated 2 years ago
- ☆12Updated last year
- A small benchmark comparing serialization formats for Pandas data frames☆43Updated 6 years ago
- Phi_K correlation analyzer library☆164Updated 3 months ago
- Repository for the book Fast Python - published by Manning☆96Updated last week
- This Repository contains the material for the tutorial "Introduction to MLOps with MLflow" held at pyData/pyCon Berlin 2022.☆23Updated 3 years ago
- A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using th…☆115Updated 2 years ago
- stratx is a library for A Stratification Approach to Partial Dependence for Codependent Variables☆66Updated last year
- Graph-Based Clustering using connected components and spanning trees.☆25Updated 3 years ago
- Clustergram - Visualization and diagnostics for cluster analysis in Python☆125Updated last month
- Tool for whitebox (binning + logreg) model development☆77Updated 3 years ago
- Simple utility that retrieves current Jupyter notebook filename or path, when run from Jupyter notebook.☆60Updated 9 months ago
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM☆84Updated last year
- Use pathlib syntax to easily work with Pandas series containing file paths.☆69Updated last year
- Foundational tools for BCG X's data science packages.☆35Updated 9 months ago
- Companion Notebooks and Data for Data Science with Python and Dask from Manning Publications☆52Updated 4 years ago
- ☆20Updated 2 years ago
- Code samples for the Effective Data Science Infrastructure book☆115Updated last year
- Data Analysis Baseline Library☆132Updated 6 months ago
- CraftML is a RESTful web service for easy pipeline creation without code.☆13Updated 4 years ago
- Increase citations, ease review & collaboration. A collection of "easy wins" to make machine learning in research reproducible. This tut…☆74Updated 5 months ago
- Tutorial material on machine learning with dirty data in Python☆60Updated 10 months ago
- ☆9Updated 5 years ago
- DataFrame support for scikit-learn.☆63Updated last year
- Utilities for monitoring and interacting with Jupyter Notebooks☆38Updated 9 months ago