coiled / data-science-at-scale
A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).
☆115 Updated 2 years ago
Alternatives and similar repositories for data-science-at-scale:
Users interested in data-science-at-scale are comparing it to the repositories listed below.
- Jupyter Notebooks and other material from tutorial sessions on Machine Learning, Data Science, and related ☆56 Updated 3 years ago
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆77 Updated last year
- Sample projects using Ploomber. ☆86 Updated last year
- PyData London 2022 Tutorial ☆66 Updated 2 years ago
- Deep Learning from Scratch with PyTorch ☆116 Updated 4 years ago
- Notebooks that support blog posts and tech talks on Dask / Coiled. ☆47 Updated 2 months ago
- Dask tutorial material for video tutorial series ☆87 Updated last year
- Repository for a workshop on Bayesian Decision Analysis ☆69 Updated 2 years ago
- Companion Notebooks and Data for Data Science with Python and Dask from Manning Publications ☆52 Updated 4 years ago
- ForML - A development framework and MLOps platform for the lifecycle management of data science projects ☆106 Updated last year
- 🐍 Material for PyData Global 2021 Presentation: Effective Testing for Machine Learning Projects ☆81 Updated 3 years ago
- ☆29 Updated 5 years ago
- Data Analysis Baseline Library ☆132 Updated 6 months ago
- Structural Time Series on US electricity demand data ☆22 Updated 4 years ago
- Increase citations, ease review & collaboration. A collection of "easy wins" to make machine learning in research reproducible. This tut… ☆74 Updated 4 months ago
- ☆15 Updated 3 years ago
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 Updated 3 years ago
- It's all in the name ☆77 Updated last year
- One day workshop for machine learning with scikit-learn ☆63 Updated last year
- ☆133 Updated 11 months ago
- Berlin Time Series Analysis Repository ☆99 Updated 2 years ago
- The fast.ai data ethics course ☆16 Updated 2 years ago
- HoloViz tutorial for KDD 2022 ☆35 Updated 2 years ago
- Data manipulation, analysis and visualisation in Python - specialist course Doctoral schools of Ghent University ☆107 Updated 3 months ago
- Code samples for the Effective Data Science Infrastructure book ☆115 Updated last year
- Talks about vaex ☆36 Updated 2 years ago
- Python port of "Common statistical tests are linear models" by Jonas Kristoffer Lindeløv. ☆91 Updated 8 months ago
- Feature engineering package with sklearn like functionality ☆54 Updated 8 months ago
- Example PyMC3 project for performing Bayesian data analysis using a probabilistic programming approach to machine learning. ☆105 Updated 6 years ago
- 📈 The panel-highcharts package makes it easy to use HighCharts in Python, Notebooks and with HoloViz Panel. ☆155 Updated 2 years ago