Implementation of Spark code in Jupyter notebooks. Topics include: RDDs and DataFrames, exploratory data analysis (EDA), handling multiple DataFrames, visualization, and machine learning
☆30Aug 26, 2020Updated 5 years ago
Alternatives and similar repositories for pySpark_tutorial
Users that are interested in pySpark_tutorial are comparing it to the libraries listed below.
- ☆13Oct 21, 2020Updated 5 years ago
- Tutorial for Topic Modelling using PySpark and Spark NLP☆16May 29, 2020Updated 5 years ago
- A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics …☆20Nov 12, 2021Updated 4 years ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB.☆43Sep 26, 2023Updated 2 years ago
- Recency, Frequency, and Monetary are three behavioral attributes and are quite simple, in that they can be easily computed for any databa…☆15Nov 20, 2025Updated 6 months ago
- Teaching Materials for Distributed Statistical Computing (teaching materials for big-data distributed computing)☆109Jun 11, 2024Updated last year
- A tf.keras implementation of DCGAN to generate images of new Pokemon☆11Feb 2, 2023Updated 3 years ago
- PySpark Code for Hands-on Learners☆117Nov 3, 2019Updated 6 years ago
- Tutorial: How to Use Alembic in Python for database migration☆18May 25, 2024Updated last year
- Tutorial for building an API with Flask☆10Jun 22, 2020Updated 5 years ago
- Learning and Processing over Networks workshop AMLD 2019☆28May 20, 2022Updated 4 years ago
- ☆10Sep 17, 2022Updated 3 years ago
- Coefficient of Variation (CV) and Coefficient of Quartile Variation (CQV) with Confidence Intervals (CI)☆10May 7, 2026Updated 2 weeks ago
- Generative Adversarial Networks☆10Feb 2, 2023Updated 3 years ago
- This description covers: 🏗️ Infrastructure: Terraform + GCP 🔒 Security: private VPC 🌐 Network: GCP gateway, firewall 🎯 Components: Obs…☆38Oct 21, 2025Updated 7 months ago
- Natural Language Processing with Flair, published by Packt☆26Mar 2, 2026Updated 2 months ago
- Land use determination and urbanization over time from landsat images☆13Nov 15, 2017Updated 8 years ago
- The proposed solution shows an approach to unify and centralize logs across different compute platforms like EC2, ECS, EKS and Lambda wi…☆14Oct 17, 2023Updated 2 years ago
- Random Forest Regression☆25Jun 1, 2018Updated 7 years ago
- Official Repository of Six Dragons Fly Again (ISMIR 2024)☆15Nov 13, 2025Updated 6 months ago
- The dataset contains Wikipedia comments which have been labeled by human raters for toxic behavior.☆11Jun 20, 2020Updated 5 years ago
- 🐍 Quick reference guide to common patterns & functions in PySpark.☆677Feb 21, 2023Updated 3 years ago
- CJOBS☆16Dec 22, 2018Updated 7 years ago
- Personalized and Interactive Music Recommendation with Bandit approach☆11Sep 15, 2019Updated 6 years ago
- The accompanying repo for the hyperparameters optimization bdx meetup talk, blog post and webinar☆11Feb 1, 2017Updated 9 years ago
- ☆12Sep 13, 2018Updated 7 years ago
- ☆26Sep 4, 2018Updated 7 years ago
- Here is a conglomeration of files depicting the code we wrote to create an autonomous drone using a CNN-LSTM model to aid in food and packa…☆11Oct 30, 2020Updated 5 years ago
- ☆17Aug 31, 2023Updated 2 years ago
- ☆36Feb 12, 2026Updated 3 months ago
- Supporting tools for the Applied Time Series Analysis and Forecasting book☆10Jun 30, 2025Updated 10 months ago
- Create graphs of cumulative cases over cumulative deaths for COVID-19☆12May 3, 2020Updated 6 years ago
- ☆13Apr 13, 2026Updated last month
- Building a simple facemask detector using Deep Learning(Keras) and OpenCV☆10Mar 25, 2023Updated 3 years ago
- Pointax: PointMaze Environment for JAX☆28Oct 22, 2025Updated 7 months ago
- Bluez-Dubbing: A Modular End-to-End Multilingual AI System for Automatic Video Translation☆29May 9, 2026Updated last week
- Dragon ball REST API☆18Oct 27, 2023Updated 2 years ago
- Info for the Linux Academy AWS Cloud Practitioner Study group!☆14Sep 25, 2019Updated 6 years ago
- This repository will contain a demo using Weaviate with data and metadata from the arXiv dataset.☆15Mar 8, 2022Updated 4 years ago