O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
☆230 Jun 26, 2023 Updated 2 years ago
Alternatives and similar repositories for data-algorithms-with-spark
Users that are interested in data-algorithms-with-spark are comparing it to the libraries listed below
- Machine Learning Course @ Santa Clara University ☆24 Jun 10, 2020 Updated 5 years ago
- Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University ☆166 Dec 4, 2025 Updated 3 months ago
- PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2 ☆88 Jan 3, 2020 Updated 6 years ago
- PySpark-Tutorial provides basic algorithms using PySpark ☆1,272 May 26, 2025 Updated 9 months ago
- Provide functionality to build statistical models to repair dirty tabular data in Spark ☆12 Apr 21, 2023 Updated 2 years ago
- Haskell Cookbook, published by Packt ☆25 Jan 18, 2023 Updated 3 years ago
- Code repository for the "PySpark in Action" book ☆214 Jun 11, 2025 Updated 8 months ago
- Python extension pack for Anaconda ☆22 Oct 10, 2018 Updated 7 years ago
- 🐍 Quick reference guide to common patterns & functions in PySpark. ☆662 Feb 21, 2023 Updated 3 years ago
- Amazon EMR Notebook to show how to read from and write to Delta tables with Amazon EMR ☆17 Apr 27, 2025 Updated 10 months ago
- FlaskRestful + Swagger UI + Docker Compose + Unit Test | How to organize Python Code for REST API ☆14 Jun 5, 2022 Updated 3 years ago
- PySpark functions and utilities with examples. Assists ETL process of data modeling ☆104 Dec 3, 2020 Updated 5 years ago
- Talks from the UW Python for Geosciences Seminar ☆12 Mar 1, 2016 Updated 10 years ago
- ☆12 Jun 23, 2016 Updated 9 years ago
- Pyspark RDD, DataFrame and Dataset Examples in Python language ☆1,346 Dec 7, 2025 Updated 3 months ago
- Implementing best practices for PySpark ETL jobs and applications. ☆2,081 Jan 1, 2023 Updated 3 years ago
- Python library for deploying models built using Python to Alteryx Promote. ☆15 Dec 10, 2021 Updated 4 years ago
- Experimental plugin for scikit-learn to be able to run (some estimators) on Intel GPUs via numba-dpex. ☆16 Feb 28, 2024 Updated 2 years ago
- ☆12 Jul 12, 2021 Updated 4 years ago
- Mobile robot data were analyzed with Apache Spark to extract five different statistical results such as travel time, waiting time, average… ☆15 Apr 5, 2022 Updated 3 years ago
- Spark Time Series Set data analysis ☆12 Dec 14, 2020 Updated 5 years ago
- Apache Hadoop 3 Quick Start Guide, published by Packt ☆14 Apr 14, 2023 Updated 2 years ago
- Code for the Apache Airflow Technical Essentials live training on O'Reilly ☆32 Sep 10, 2024 Updated last year
- Hands-On Chatbot Development with Alexa Skills and Amazon Lex, published by Packt ☆15 Jan 30, 2023 Updated 3 years ago
- Git/Github Intro ☆13 Jun 17, 2015 Updated 10 years ago
- Introduction to Dask for PyTorch Workflows ☆13 Mar 3, 2021 Updated 5 years ago
- Source Code for 'Practical Haskell, 3rd Edition' by Alejandro Serrano Mena ☆13 Oct 11, 2022 Updated 3 years ago
- Reinforcement Learning for Uplift Modeling ☆13 Mar 13, 2021 Updated 4 years ago
- This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring… ☆1,227 Sep 8, 2025 Updated 6 months ago
- 👋 Project done for @hankified https://helloish.com ☆16 Oct 18, 2019 Updated 6 years ago
- Azure Data Engineering Cookbook 2nd-edition, published by Packt ☆35 Sep 20, 2023 Updated 2 years ago
- MapReduce, Spark, Java, and Scala for Data Algorithms Book ☆1,083 Oct 14, 2024 Updated last year
- The source code for the book Modern Data Engineering with Apache Spark ☆39 Jul 26, 2022 Updated 3 years ago
- ☆16 Jul 5, 2021 Updated 4 years ago
- Aspect oriented programming for Python. Patch everything! ☆13 Jan 7, 2019 Updated 7 years ago
- TriScale software ☆14 Apr 23, 2024 Updated last year
- Code repository for IoT Building Arduino based Projects, published by Packt ☆16 Jan 14, 2021 Updated 5 years ago
- A simple Spark-powered ETL framework that just works 🍺 ☆185 Oct 2, 2025 Updated 5 months ago
- Source code for 'Spring Boot Messaging' by Felipe Gutierrez ☆38 Sep 3, 2017 Updated 8 years ago