A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).
☆120Nov 20, 2022Updated 3 years ago
Alternatives and similar repositories for data-science-at-scale
Users who are interested in data-science-at-scale are comparing it to the libraries listed below.
- ☆13Jul 12, 2021Updated 4 years ago
- Introduction to Dask for PyTorch Workflows☆13Mar 3, 2021Updated 5 years ago
- Python implementation of Gibbs sampling for the naïve Bayes model presented by Resnik and Hardisty☆14Feb 10, 2018Updated 8 years ago
- This repository contains the resources used for presentation/discussion in weekly iRE Lab meetings.☆14Sep 8, 2017Updated 8 years ago
- Gibbs sampler for a Naive Bayes document classifier☆24Dec 15, 2012Updated 13 years ago
- ☆10Mar 14, 2020Updated 5 years ago
- A Panel app to demonstrate distortions created by non-perceptual colormaps on geophysical data☆12Jan 22, 2026Updated last month
- Materials for the "Recommender Systems through the lens of Decision Theory" tutorial delivered at the 30th Web Conference (WWW '21).☆11Apr 13, 2021Updated 4 years ago
- A small utility repo for checkerboard sampling☆11Jul 28, 2025Updated 7 months ago
- This is a repository of code and datasets for blog posts or articles I've written.☆12Feb 1, 2019Updated 7 years ago
- JupyterHub deployment for ENGR101 Winter 2018 at Portland Community College☆11Dec 8, 2022Updated 3 years ago
- ☆12Apr 20, 2021Updated 4 years ago
- POS tagging models for Hindi English Code Mixed Tweets☆11Aug 1, 2018Updated 7 years ago
- Counterfactual Evaluation and Learning for Interactive Systems: Foundations, Implementations, and Recent Advances☆12Aug 14, 2022Updated 3 years ago
- A place to provide Coiled feedback☆29Mar 5, 2025Updated last year
- Example of using AWS for serverless on-demand seismic processing.☆14Mar 30, 2021Updated 4 years ago
- This repo contains a short version of a dask tutorial.☆12Dec 5, 2022Updated 3 years ago
- Python library for interacting with Dask clusters in Saturn☆12Sep 4, 2025Updated 6 months ago
- The ecosystem of geospatial machine learning tools in the Pangeo world.☆12Mar 17, 2025Updated 11 months ago
- An IPython notebook analysis of the UWC Tampines commercial building dataset☆13Apr 25, 2019Updated 6 years ago
- EMNLP 2020: Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots☆12Dec 15, 2020Updated 5 years ago
- Comparison of Python packages and libraries for visualising geospatial vector data: applications for Smarter Cities.☆30Dec 3, 2022Updated 3 years ago
- ☆19Feb 27, 2025Updated last year
- Notebooks for Pangeo Showcase Talk on Oct 12, 2022☆14Oct 13, 2022Updated 3 years ago
- Demo of DuckDB's Spark API implementation: the same PySpark code, but with DuckDB under the hood☆15Nov 16, 2023Updated 2 years ago
- Gibbs sampling inference to LDA☆19Apr 4, 2014Updated 11 years ago
- ☆32Aug 14, 2020Updated 5 years ago
- A repo of Flyte-related conference talks☆14Feb 24, 2024Updated 2 years ago
- Module ENVS456 - University of Liverpool☆12Jan 16, 2017Updated 9 years ago
- Simple examples of data pipelines from xarray to ML training☆22Dec 19, 2019Updated 6 years ago
- ☆43Dec 9, 2025Updated 2 months ago
- A small utility for generating ND array pyramids using Xarray and Zarr.☆116Feb 1, 2026Updated last month
- This repo contains the demos and a link to the slides for the Ibis + DuckDB geospatial talk☆16Jul 11, 2024Updated last year
- Jupyter notebooks in support of the lecture course '3D7: Finite Element Methods'☆18Jan 25, 2021Updated 5 years ago
- A High-Performance Data Science Toolkit for the Earth Sciences☆71Jun 8, 2024Updated last year
- Easy to use Python library of customized functions for cleaning and analyzing data.☆521Updated this week
- Unmap data from a pseudocolor image, with or without knowing the colormap.☆18Apr 4, 2023Updated 2 years ago
- Cubed-Sphere data processing with xarray☆18Jan 16, 2020Updated 6 years ago
- Browsing multiple Zarr archives within a file system☆19May 16, 2024Updated last year