DIYBigData / spark-data-analysis-projects
A collection of data analysis projects done using PySpark via Jupyter notebooks.
☆10 Updated 2 years ago
Related projects
Alternatives and complementary repositories for spark-data-analysis-projects
- Spark and Python (PySpark) Examples ☆39 Updated 3 years ago
- ☆19 Updated 6 years ago
- Final Project for IoT: Big Data Processing and Analytics class. Analyzing U.S. nationwide temperature from IoT sensors in real-time ☆67 Updated 7 years ago
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆16 Updated 6 years ago
- Python Machine Learning (ML) project that demonstrates the archetypal ML workflow within a Jupyter notebook, with automated model deploym… ☆60 Updated last year
- PySpark Code for Hands-on Learners ☆114 Updated 5 years ago
- Deep Learning with Apache Spark and Deep Cognition ☆58 Updated 6 years ago
- Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc. ☆52 Updated 8 years ago
- Jupyter notebooks for pyspark tutorials given at University ☆104 Updated 2 months ago
- PySpark in Google Colab: A simple machine learning (Linear Regression) model ☆36 Updated 5 years ago
- Source code for 'Building Machine Learning and Deep Learning Models on Google Cloud Platform' ☆37 Updated 5 years ago
- Companion Notebooks and Data for Data Science with Python and Dask from Manning Publications ☆52 Updated 4 years ago
- Work for Mastering Large Datasets with Python ☆18 Updated last year
- Sentiment Analysis of a Twitter Topic with Spark Structured Streaming ☆55 Updated 5 years ago
- Analytics projects using Big Data eco-systems (Hadoop, Spark, Storm) ☆16 Updated 2 years ago
- Because it's never too late to start taking notes and 'public' it... ☆60 Updated 3 weeks ago
- Insight Data Engineering Project ☆15 Updated 3 years ago
- PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2 ☆83 Updated 4 years ago
- PySpark, Databricks, h2o, MLlib ☆18 Updated 8 years ago
- Code to 1) scrape Wikipedia page view counts, and to 2) conduct time series analysis with GAM ☆47 Updated 7 years ago
- pyspark dataframe made easy ☆16 Updated 2 years ago
- Use Kafka and Apache Spark streaming to perform click stream analytics ☆76 Updated 4 years ago
- A repo to track data engineering projects ☆13 Updated 2 years ago
- Apache Spark Interview Questions and Answers ☆21 Updated 4 years ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 Updated 7 years ago
- My presentation at ODSC India 2018 about Deep Learning with Apache Spark ☆27 Updated 6 years ago
- The demo of using Kafka, Spark, Hive, Cassandra, etc. by using Docker. It produces the production ready environment for any kinds of big d… ☆31 Updated 5 years ago
- ☆16 Updated last year
- Counting Tweets Per User in Real-Time ☆41 Updated 7 years ago
- Insurance fraud claims analysis project ☆49 Updated 10 months ago