mahmoudparsian / data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
☆228 · Jun 26, 2023 · Updated 2 years ago
Alternatives and similar repositories for data-algorithms-with-spark
Users who are interested in data-algorithms-with-spark are comparing it to the repositories listed below
- Machine Learning Course @ Santa Clara University · ☆24 · Jun 10, 2020 · Updated 5 years ago
- Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University · ☆166 · Dec 4, 2025 · Updated 2 months ago
- PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2 · ☆88 · Jan 3, 2020 · Updated 6 years ago
- PySpark-Tutorial provides basic algorithms using PySpark · ☆1,273 · May 26, 2025 · Updated 8 months ago
- Examples for learning spark · ☆19 · Aug 19, 2015 · Updated 10 years ago
- Code repository for the "PySpark in Action" book · ☆212 · Jun 11, 2025 · Updated 8 months ago
- Python extension pack for Anaconda · ☆22 · Oct 10, 2018 · Updated 7 years ago
- 🐍 Quick reference guide to common patterns & functions in PySpark. · ☆652 · Feb 21, 2023 · Updated 2 years ago
- PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster · ☆488 · Oct 15, 2024 · Updated last year
- Anaconda plugin for StarCluster · ☆21 · Aug 14, 2024 · Updated last year
- ☆12 · Jun 23, 2016 · Updated 9 years ago
- Feature selection for machine learning using mutual information. · ☆15 · Dec 4, 2024 · Updated last year
- PySpark functions and utilities with examples. Assists ETL process of data modeling · ☆104 · Dec 3, 2020 · Updated 5 years ago
- ☆10 · Oct 3, 2022 · Updated 3 years ago
- Pyspark RDD, DataFrame and Dataset Examples in Python language · ☆1,342 · Dec 7, 2025 · Updated 2 months ago
- Implementing best practices for PySpark ETL jobs and applications. · ☆2,064 · Jan 1, 2023 · Updated 3 years ago
- ☆12 · Jul 12, 2021 · Updated 4 years ago
- Experimental plugin for scikit-learn to be able to run (some estimators) on Intel GPUs via numba-dpex. · ☆16 · Feb 28, 2024 · Updated last year
- Apache Hadoop 3 Quick Start Guide, published by Packt · ☆14 · Apr 14, 2023 · Updated 2 years ago
- Python library for deploying models built using Python to Alteryx Promote. · ☆15 · Dec 10, 2021 · Updated 4 years ago
- Code for the Apache Airflow Technical Essentials live training on O'Reilly · ☆32 · Sep 10, 2024 · Updated last year
- Hands-On Chatbot Development with Alexa Skills and Amazon Lex, published by Packt · ☆15 · Jan 30, 2023 · Updated 3 years ago
- Introduction to Dask for PyTorch Workflows · ☆13 · Mar 3, 2021 · Updated 4 years ago
- Azure Databricks Cookbook, Published by Packt · ☆57 · Jun 24, 2023 · Updated 2 years ago
- This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring… · ☆1,218 · Sep 8, 2025 · Updated 5 months ago
- 👋 Project done for @hankified https://helloish.com · ☆16 · Oct 18, 2019 · Updated 6 years ago
- pytest plugin that checks URLs · ☆18 · May 16, 2024 · Updated last year
- Azure Data Engineering Cookbook 2nd-edition, published by Packt · ☆35 · Sep 20, 2023 · Updated 2 years ago
- MapReduce, Spark, Java, and Scala for Data Algorithms Book · ☆1,084 · Oct 14, 2024 · Updated last year
- The source code for the book Modern Data Engineering with Apache Spark · ☆39 · Jul 26, 2022 · Updated 3 years ago
- Aspect oriented programming for Python. Patch everything! · ☆13 · Jan 7, 2019 · Updated 7 years ago
- TriScale software · ☆14 · Apr 23, 2024 · Updated last year
- ☆16 · Jul 5, 2021 · Updated 4 years ago
- Various useful data structures in Python · ☆39 · Nov 14, 2019 · Updated 6 years ago
- ☆19 · Dec 2, 2024 · Updated last year
- Code repository for IoT Building Arduino based Projects, published by Packt · ☆16 · Jan 14, 2021 · Updated 5 years ago
- Samples and documentation for various advertising and marketing use cases on AWS. · ☆36 · May 23, 2023 · Updated 2 years ago
- A simple Spark-powered ETL framework that just works 🍺 · ☆183 · Oct 2, 2025 · Updated 4 months ago
- Mastering Big Data Analytics with PySpark, Published by Packt · ☆166 · Aug 20, 2024 · Updated last year