anindya-saha / Data-Science-with-Spark
Machine Learning and Data Analysis Case Studies using Spark.
☆72Updated 3 years ago
Alternatives and similar repositories for Data-Science-with-Spark:
Users who are interested in Data-Science-with-Spark are comparing it to the repositories listed below
- Project work for Udacity's AB Testing Course☆82Updated 7 years ago
- PyCon SG 2016 - Customer Segmentation in Python☆56Updated 8 years ago
- Learn Machine Learning using PySpark from scratch☆19Updated 6 years ago
- a curated list of R tutorials for Data Science, NLP and Machine Learning☆23Updated 8 years ago
- ☆14Updated 7 years ago
- Churn Prediction with PySpark using MLlib and ML Packages☆56Updated 8 years ago
- Repository for sharing the knowledge from the learning path of Kaggle Learning. All contributions welcome :).☆149Updated 6 years ago
- Solutions to the book "Collection of Data Science TakeHome Challenges" in Python.☆10Updated 7 years ago
- PySpark Code for Hands-on Learners☆116Updated 5 years ago
- ☆77Updated 8 years ago
- Program assignments for the Deep Learning Specialization at Coursera by Andrew Ng☆51Updated 7 years ago
- Tips for Advanced Feature Engineering☆52Updated 4 years ago
- Source Code for 'Machine Learning with PySpark' by Pramod Singh☆113Updated 5 years ago
- Generic codes related to NLP☆84Updated 6 years ago
- Getting started with PySpark and MLlib☆297Updated 6 years ago
- My Solutions to "A Collection of Data Science Take-Home Challenges" by Giulio Palombo.☆78Updated 5 years ago
- This repository contains Spark, MLlib, PySpark and Dataframes projects☆43Updated 7 years ago
- Data modeling, building data warehouses and data lakes, automating data pipelines, and working with massive datasets.☆13Updated 5 years ago
- Data science blog☆33Updated 6 years ago
- Simple sentiment analysis model with PySpark☆42Updated 6 years ago
- PyCon 2017 tutorial on time series analysis☆72Updated 7 years ago
- A presentation on the key points to consider when appearing for a Data Science job interview☆40Updated 6 years ago
- ☆110Updated 8 years ago
- Customer lifetime value analysis (CLV analysis), using the Gamma-Gamma model to estimate the average transaction value for each customer.☆45Updated 6 years ago
- Codes written for some competitions☆13Updated 8 years ago
- MLFlow Spark Summit 2019 Presentation☆67Updated 5 years ago
- Frank Kane's Taming Big Data with Apache Spark and Python, published by Packt☆118Updated 2 years ago
- Codes used for the hack session in DHS 2019☆53Updated 5 years ago
- Notes from different sources such as Harvard CS109 course, Springboard's Data Science Interview questions, Elements of Programming Interv…☆34Updated 4 years ago
- Contains code and presentation for my interactive hack session, 'Effective Feature Engineering: A Structured Approach to Building Better …☆30Updated 4 years ago