DIYBigData / spark-data-analysis-projects
A collection of data analysis projects done using PySpark via Jupyter notebooks.
☆10 Updated 2 years ago
Alternatives and similar repositories for spark-data-analysis-projects
Users who are interested in spark-data-analysis-projects are comparing it to the repositories listed below
- Pyspark in Google Colab: A simple machine learning (Linear Regression) model ☆36 Updated 6 years ago
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆16 Updated 6 years ago
- ☆18 Updated 7 years ago
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆53 Updated 3 years ago
- Sentiment Analysis of a Twitter Topic with Spark Structured Streaming ☆55 Updated 6 years ago
- Python Machine Learning (ML) project that demonstrates the archetypal ML workflow within a Jupyter notebook, with automated model deploym… ☆62 Updated 2 years ago
- Companion Notebooks and Data for Data Science with Python and Dask from Manning Publications ☆52 Updated 4 years ago
- Spark and Python (PySpark) Examples ☆39 Updated 3 years ago
- Work for Mastering Large Datasets with Python ☆19 Updated 2 years ago
- A Scalable Data Cleaning Library for PySpark. ☆27 Updated 6 years ago
- Deep Learning with Apache Spark and Deep Cognition ☆59 Updated 6 years ago
- A repository for a PySpark Cookbook by Tomasz Drabas and Denny Lee ☆59 Updated 6 years ago
- Building simple ML apps with Streamlit ☆24 Updated 4 years ago
- Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc ☆51 Updated 8 years ago
- pyspark dataframe made easy ☆16 Updated 3 years ago
- My presentation at ODSC India 2018 about Deep Learning with Apache Spark ☆27 Updated 6 years ago
- Machine Learning and Data Analysis Case Studies using Spark. ☆72 Updated 4 years ago
- Iowa House Prices Kaggle (top 5%) ☆13 Updated 11 months ago
- Machine learning and process automation ☆137 Updated 2 years ago
- Partly lecture and partly a hands-on tutorial and workshop, this is a three part series on how to get started with MLflow. In this four p… ☆39 Updated 4 years ago
- My applied big data analytic project with pyspark. ☆10 Updated 2 years ago
- Learning Machine Learning and showcasing my work for 100 Days. ☆16 Updated 6 years ago
- Data models, build data warehouses and data lakes, automate data pipelines, and worked with massive datasets. ☆13 Updated 5 years ago
- A lightweight benchmark utility for PySpark ☆17 Updated 5 years ago
- Data Science Quick Tips Repository! ☆47 Updated last year
- ☆13 Updated 4 years ago
- ☆26 Updated 5 years ago
- Build end-to-end Machine Learning pipeline to predict accessibility of playgrounds in NYC ☆15 Updated 4 years ago
- Data analysis using numpy, pandas, matplotlib, seaborn, sqlite3, data wrangling ☆31 Updated 5 years ago
- Resources for Data Science Kick Starter Workshop at ODSC India 2019 ☆20 Updated 4 years ago