pnavaro / big-data
Python tools for big data
☆53Updated last year
Alternatives and similar repositories for big-data
Users who are interested in big-data are comparing it to the repositories listed below
- A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using th…☆116Updated 2 years ago
- Start a data science project with modern tools☆201Updated 2 years ago
- Phi_K correlation analyzer library☆166Updated last week
- Notebooks that support blog posts and tech talks on Dask / Coiled.☆47Updated 7 months ago
- pipreqs with jupyter notebook support☆70Updated 2 years ago
- In which I put together my thoughts on the practice of data science.☆302Updated 2 years ago
- Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.☆132Updated last year
- An abstraction layer for parameter tuning☆35Updated last year
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM☆84Updated last year
- Tutorials on creating a reproducible and maintainable data science project☆148Updated 3 years ago
- Sensible multi-core apply function for Pandas☆88Updated this week
- ☆150Updated 2 years ago
- Data Analysis Baseline Library☆133Updated 11 months ago
- How to Interpret SHAP Analyses: A Non-Technical Guide☆56Updated 3 years ago
- Get started DVC project (NLP, random forest)☆183Updated last year
- Clustergram - Visualization and diagnostics for cluster analysis in Python☆127Updated 3 months ago
- Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for the purpose of analysis and machine learning.☆65Updated 5 years ago
- This repository contains the material for the tutorial "Introduction to MLOps with MLflow" held at PyData/PyCon Berlin 2022.☆23Updated 3 years ago
- A library for debugging/inspecting machine learning classifiers and explaining their predictions☆311Updated 5 months ago
- A curated list of Python libraries used for data science.☆90Updated last year
- PyData London 2022 Tutorial☆67Updated 3 years ago
- Source for the PyViz.org website.☆182Updated last week
- Tutorial for implementing data validation in data science pipelines☆33Updated 3 years ago
- Convert from Python script to Jupyter notebook and vice versa☆127Updated last year
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆114Updated last year
- 📈🔍 Lets Python do AB testing analysis.☆78Updated 5 months ago
- Wrap-up to automatically tune xgboost in Python.☆80Updated 4 years ago
- Scikit-Learn API wrapper for Keras.☆251Updated 9 months ago
- ☆101Updated last week
- Investigation for the PyData London 2023 and ODSC 2023 conferences comparing Pandas 2, Polars and Dask☆11Updated last year