jadianes / spark-py-notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
☆1,667 · Mar 16, 2024 · Updated last year
Alternatives and similar repositories for spark-py-notebooks
Users who are interested in spark-py-notebooks are comparing it to the repositories listed below.
- An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset ☆830 · Oct 6, 2021 · Updated 4 years ago
- Ways of doing Data Science Engineering and Machine Learning in R and Python ☆620 · Apr 25, 2021 · Updated 4 years ago
- PySpark-Tutorial provides basic algorithms using PySpark ☆1,273 · May 26, 2025 · Updated 8 months ago
- R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks ☆123 · Sep 6, 2017 · Updated 8 years ago
- PySpark + Scikit-learn = Sparkit-learn ☆1,154 · Dec 31, 2020 · Updated 5 years ago
- Apache Spark (PySpark) Practice on Real Data ☆273 · Jan 31, 2020 · Updated 6 years ago
- Code snippets and tutorials for working with social science data in PySpark ☆419 · Aug 11, 2017 · Updated 8 years ago
- Code base for the Learning PySpark book (in preparation) ☆628 · Apr 16, 2019 · Updated 6 years ago
- Code repository for Learning PySpark by Packt ☆342 · Jan 30, 2023 · Updated 3 years ago
- Fundamentals of Spark with Python (using PySpark), code examples ☆362 · Oct 29, 2022 · Updated 3 years ago
- A curated list of awesome Apache Spark packages and resources. ☆1,861 · Oct 24, 2024 · Updated last year
- Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce,… ☆28,857 · Mar 20, 2024 · Updated last year
- A free tutorial for Apache Spark. ☆992 · Jan 5, 2026 · Updated last month
- Interactive and Reactive Data Science using Scala and Spark. ☆3,151 · May 16, 2023 · Updated 2 years ago
- Learn the pyspark API through pictures and simple examples ☆170 · Jan 23, 2021 · Updated 5 years ago
- Updated repository ☆157 · Nov 25, 2021 · Updated 4 years ago
- Jupyter notebooks for pyspark tutorials given at University ☆110 · Jan 7, 2026 · Updated last month
- Getting start with PySpark and MLlib ☆300 · May 7, 2018 · Updated 7 years ago
- Jupyter magics and kernels for working with remote Spark clusters ☆1,363 · Sep 9, 2025 · Updated 5 months ago
- Jupyter notebooks from the scikit-learn video series ☆3,779 · Mar 5, 2024 · Updated last year
- The "Python Machine Learning (1st edition)" book code repository and info resource ☆12,585 · Nov 20, 2024 · Updated last year
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,541 · Dec 2, 2024 · Updated last year
- TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters. ☆3,859 · Jul 10, 2023 · Updated 2 years ago
- Implementing best practices for PySpark ETL jobs and applications. ☆2,074 · Jan 1, 2023 · Updated 3 years ago
- Repository of teaching materials, code, and data for my data analysis and machine learning projects. ☆6,612 · Jun 21, 2023 · Updated 2 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 · May 13, 2016 · Updated 9 years ago
- A wine recommender system tutorial using Python technologies such as Django, Pandas, or Scikit-learn, and others such as Bootstrap. ☆347 · Mar 17, 2018 · Updated 7 years ago
- pyspark sample scripts ☆16 · Jan 9, 2019 · Updated 7 years ago
- Notes on Apache Spark (pyspark) ☆297 · Mar 3, 2019 · Updated 6 years ago
- Code to accompany Advanced Analytics with Spark from O'Reilly Media ☆1,527 · Sep 25, 2024 · Updated last year
- OnLine Spectral Search ENgine for Proteomics big data using Apache Spark, Python/Flask, and AngularJS ☆15 · Sep 14, 2015 · Updated 10 years ago
- Ready-to-run Docker images containing Jupyter applications ☆8,412 · Feb 8, 2026 · Updated last week
- Distributed Deep learning with Keras & Spark ☆1,578 · May 1, 2023 · Updated 2 years ago
- Apache Spark - A unified analytics engine for large-scale data processing ☆42,810 · Updated this week
- TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2) ☆43,802 · Jul 26, 2024 · Updated last year
- A pure Python implementation of Apache Spark's RDD and DStream interfaces. ☆270 · Sep 3, 2024 · Updated last year
- A collection of IPython notebooks covering various topics. ☆2,609 · Oct 19, 2020 · Updated 5 years ago
- This repository contains code examples for the Stanford's course: TensorFlow for Deep Learning Research. ☆10,382 · Dec 22, 2020 · Updated 5 years ago
- Tutorial on scikit-learn and IPython for parallel machine learning ☆1,599 · Oct 4, 2016 · Updated 9 years ago