A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).
☆121Nov 20, 2022Updated 3 years ago
Alternatives and similar repositories for data-science-at-scale
Users who are interested in data-science-at-scale are comparing it to the libraries listed below.
- ☆13Jul 12, 2021Updated 4 years ago
- e-Rum2020::A Unified Approach For Writing Automatic Reports☆20Jun 19, 2020Updated 5 years ago
- Python implementation of Gibbs sampling for the naïve Bayes model presented by Resnik and Hardisty☆14Feb 10, 2018Updated 8 years ago
- The ecosystem of geospatial machine learning tools in the Pangeo world.☆12Mar 17, 2025Updated last year
- Cubed-Sphere data processing with xarray☆18Jan 16, 2020Updated 6 years ago
- ☆15Jun 2, 2022Updated 3 years ago
- This repository contains the resources used for presentation/discussion in weekly iRE Lab meetings.☆14Sep 8, 2017Updated 8 years ago
- An xarray extension to show velocity fields as interactive maps in JupyterLab☆12Dec 2, 2020Updated 5 years ago
- ☆53Apr 4, 2026Updated last month
- nyhackr website written using RMarkdown☆10Updated this week
- version 4.x of the Princeton Geniza Project☆12May 11, 2026Updated last week
- A small utility repo for checkerboard sampling☆10Jul 28, 2025Updated 9 months ago
- Gibbs sampler for a Naive Bayes document classifier☆24Dec 15, 2012Updated 13 years ago
- ☆115Nov 7, 2022Updated 3 years ago
- ☆12Apr 20, 2021Updated 5 years ago
- JupyterHub deployment for ENGR101 Winter 2018 at Portland Community College☆11Dec 8, 2022Updated 3 years ago
- Deep Learning from Scratch with PyTorch☆121Jul 10, 2020Updated 5 years ago
- Materials for the "Recommender Systems through the lens of Decision Theory" tutorial delivered at the 30th Web Conference (WWW '21).☆11Apr 13, 2021Updated 5 years ago
- Simple examples of data pipelines from xarray to ML training☆22Dec 19, 2019Updated 6 years ago
- A High-Performance Data Science Toolkit for the Earth Sciences☆72Jun 8, 2024Updated last year
- POS tagging models for Hindi English Code Mixed Tweets☆11Aug 1, 2018Updated 7 years ago
- 'math+econ+code' masterclass on equilibrium transport and matching models in economics☆36Jun 15, 2023Updated 2 years ago
- Central repository for xarray-contrib organization☆11Aug 26, 2022Updated 3 years ago
- Example of using AWS for serverless on-demand seismic processing.☆14Mar 30, 2021Updated 5 years ago
- Materials for MIT workshop "Practical Computing Tutorials for Earth Scientists"☆39Apr 24, 2020Updated 6 years ago
- Earth System Model Collection specification☆13Feb 3, 2023Updated 3 years ago
- An IPython notebook analysis of the UWC Tampines commercial building dataset☆13Apr 25, 2019Updated 7 years ago
- Easy to use Python library of customized functions for cleaning and analyzing data.☆522Apr 14, 2026Updated last month
- Unmap data from a pseudocolor image, with or without knowing the colormap.☆18Apr 4, 2023Updated 3 years ago
- This repo contains a short version of a dask tutorial.☆12Dec 5, 2022Updated 3 years ago
- Python package to call processed EE objects via the REST API to local data☆36Jun 8, 2024Updated last year
- ☆14Nov 7, 2022Updated 3 years ago
- Course site for MACS 30250 (Spring 2020) - Perspectives on Computational Research in Economics☆18Jun 3, 2020Updated 5 years ago
- Au Naturel is a LaTeX template built on top of the standard article class, roughly emulating some characteristics of the Nature Publishin…☆10May 2, 2018Updated 8 years ago
- Models and parameterizations for the turbulent ocean surface boundary layer in Julia☆25Dec 1, 2022Updated 3 years ago
- ⛔️ DEPRECATED GPU Ocean Python/CUDA codebase☆11Nov 9, 2023Updated 2 years ago
- ☆21Sep 29, 2021Updated 4 years ago
- Yelmo ice-sheet model code base☆20Updated this week
- Deploy a production scale datacube cluster on AWS using EKS☆22Feb 25, 2026Updated 2 months ago