A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).
☆121Nov 20, 2022Updated 3 years ago
Alternatives and similar repositories for data-science-at-scale
Users that are interested in data-science-at-scale are comparing it to the libraries listed below.
- ☆13Jul 12, 2021Updated 4 years ago
- Python implementation of Gibbs sampling for the naïve Bayes model presented by Resnik and Hardisty☆14Feb 10, 2018Updated 8 years ago
- The ecosystem of geospatial machine learning tools in the Pangeo world.☆12Mar 17, 2025Updated last year
- Cubed-Sphere data processing with xarray☆18Jan 16, 2020Updated 6 years ago
- ☆14Jun 2, 2022Updated 4 years ago
- An xarray extension to show velocity fields as interactive maps in JupyterLab☆12Dec 2, 2020Updated 5 years ago
- This is a repository of code and datasets for blog posts or articles I've written.☆12Feb 1, 2019Updated 7 years ago
- ☆12Jan 18, 2019Updated 7 years ago
- Pangeo Forge public roadmap☆18May 22, 2024Updated 2 years ago
- ☆53May 25, 2026Updated 2 weeks ago
- nyhackr website written using RMarkdown☆10Updated this week
- version 4.x of the Princeton Geniza Project☆12May 28, 2026Updated last week
- Python interface to TileDB Cloud REST API☆15May 5, 2026Updated last month
- A small utility for generating ND array pyramids using Xarray and Zarr.☆118May 1, 2026Updated last month
- ☆115Nov 7, 2022Updated 3 years ago
- ☆12Apr 20, 2021Updated 5 years ago
- Module ENVS456 - University of Liverpool☆12Jan 16, 2017Updated 9 years ago
- JupyterHub deployment for ENGR101 Winter 2018 at Portland Community College☆11Dec 8, 2022Updated 3 years ago
- Deep Learning from Scratch with PyTorch☆121Jul 10, 2020Updated 5 years ago
- A place to provide Coiled feedback☆29Mar 5, 2025Updated last year
- Materials for the "Recommender Systems through the lens of Decision Theory" tutorial delivered at the 30th Web Conference (WWW '21).☆11Apr 13, 2021Updated 5 years ago
- EMNLP 2020: Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots☆12Dec 15, 2020Updated 5 years ago
- Simple examples of data pipelines from xarray to ML training☆22Dec 19, 2019Updated 6 years ago
- This repository replicates the figures from the 3rd edition of the book "Recursive Macroeconomic Theory" by Lars Ljungqvist and Thomas J.…☆12Feb 9, 2016Updated 10 years ago
- ☆33Aug 14, 2020Updated 5 years ago
- A High-Performance Data Science Toolkit for the Earth Sciences☆72Jun 8, 2024Updated 2 years ago
- POS tagging models for Hindi English Code Mixed Tweets☆11Aug 1, 2018Updated 7 years ago
- 'math+econ+code' masterclass on equilibrium transport and matching models in economics☆36Jun 15, 2023Updated 2 years ago
- The course material for the programming course in DEES, University of Manchester☆10Jan 6, 2026Updated 5 months ago
- Scripts and other artifacts for MODIS data ingestion into Amazon public hosting.☆14Jun 1, 2021Updated 5 years ago
- A list of available and reserved slots for satRdays☆15May 19, 2021Updated 5 years ago
- CS 489/698 Big Data Infrastructure (Winter 2017) at the University of Waterloo☆15Apr 17, 2017Updated 9 years ago
- Materials for MIT workshop "Practical Computing Tutorials for Earth Scientists"☆39Apr 24, 2020Updated 6 years ago
- Counterfactual Evaluation and Learning for Interactive Systems: Foundations, Implementations, and Recent Advances☆12Aug 14, 2022Updated 3 years ago
- An IPython notebook analysis of the UWC Tampines commercial building dataset☆13Apr 25, 2019Updated 7 years ago
- Easy to use Python library of customized functions for cleaning and analyzing data.☆521Apr 14, 2026Updated last month
- Unmap data from a pseudocolor image, with or without knowing the colormap.☆18Apr 4, 2023Updated 3 years ago
- Python package to call processed EE objects via the REST API to local data☆36Jun 8, 2024Updated 2 years ago
- ☆19Feb 27, 2025Updated last year