dziganto / dziganto.github.io
☆25 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for dziganto.github.io
- Installation guide for Apache Spark + Hadoop on Mac/Linux ☆58 · Updated 7 years ago
- Workshop for Spark and Databricks ☆54 · Updated 4 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆66 · Updated 8 years ago
- Repository used for Spark Trainings ☆53 · Updated last year
- MLinProduction SageMaker workshop hosted in April 2020 ☆15 · Updated 4 years ago
- notebooks for nlp-on-spark ☆13 · Updated 7 years ago
- A simple introduction to using spark ml pipelines ☆26 · Updated 6 years ago
- Using Luigi to create a Machine Learning Pipeline using the Rossman Sales data from Kaggle ☆33 · Updated 8 years ago
- Mastering Spark for Data Science, published by Packt ☆46 · Updated last year
- An example PySpark project with pytest ☆17 · Updated 7 years ago
- Code example to predict prices of Airbnb vacation rentals, using scikit-learn on Spark with spark-sklearn, on MapR. ☆44 · Updated 8 years ago
- Conversion utility from Zeppelin notes to Jupyter notebooks. ☆44 · Updated 4 years ago
- Demonstration code for MLeap, both Jupyter notebooks and projects ☆24 · Updated 5 years ago
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames. ☆100 · Updated 5 years ago
- A code-based tutorial for production level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, & Apache Drill u… ☆26 · Updated 5 years ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 · Updated 6 years ago
- Sample techniques for a variety of feature extraction methods ☆32 · Updated 3 years ago
- A couple projects using scikit-learn illustrating project decision making. ☆15 · Updated 8 years ago
- ☆16 · Updated last year
- PySpark phonetic and string matching algorithms ☆35 · Updated 8 months ago
- Partly lecture and partly a hands-on tutorial and workshop, this is a three part series on how to get started with MLflow. In this four p… ☆38 · Updated 3 years ago
- 🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python ☆83 · Updated 5 years ago
- Deep Learning with Apache Spark and Deep Cognition ☆58 · Updated 6 years ago
- PySpark Machine Learning Examples ☆44 · Updated 6 years ago
- Feature Engineering with Pipeline Talk at ODSC West 2016, Santa Clara ☆17 · Updated 8 years ago
- Code that goes along with https://humansofdata.atlan.com/2018/06/apache-airflow-disease-outbreaks-india/ ☆24 · Updated last year
- AWS Big Data Certification ☆25 · Updated last year
- HandySpark - bringing pandas-like capabilities to Spark dataframes ☆188 · Updated 5 years ago