Code examples on Apache Spark using Python
☆108Aug 11, 2022Updated 3 years ago
Alternatives and similar repositories for pyspark-examples
Users that are interested in pyspark-examples are comparing it to the libraries listed below.
- ☆18Nov 9, 2025Updated 6 months ago
- Fundamentals of Spark with Python (using PySpark), code examples☆363Oct 29, 2022Updated 3 years ago
- Apache Hadoop - Docker distribution based on CentOS 7 and Oracle Java 8☆12Feb 20, 2018Updated 8 years ago
- PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2☆89Jan 3, 2020Updated 6 years ago
- Educational notes, hands-on problems w/ solutions for Hadoop ecosystem☆87Jan 22, 2019Updated 7 years ago
- Create LAMP Stack using terraform with AWS☆11Feb 15, 2023Updated 3 years ago
- Hadoop Examples☆10Jul 1, 2022Updated 3 years ago
- Add gevent support to DataStax Python Driver for Apache Cassandra☆11Jun 10, 2020Updated 5 years ago
- Ansible Playbook to create LAMP in CentOS 7 with Apache, MySQL, PHP.☆10Dec 28, 2018Updated 7 years ago
- Apache Spark (PySpark) Practice on Real Data☆272Jan 31, 2020Updated 6 years ago
- All Certification and preparation, examples & others☆11Oct 18, 2018Updated 7 years ago
- PySpark RDD, DataFrame and Dataset Examples in Python language☆1,352Dec 7, 2025Updated 5 months ago
- Automated (Ansible) installation of HDP via Ambari Blueprint☆16Mar 10, 2017Updated 9 years ago
- ☆14Aug 24, 2021Updated 4 years ago
- AutoML Software designed to give users access to a whole plethora of ML models, some trainable on the GPU.☆14Oct 23, 2021Updated 4 years ago
- PySpark Code for Hands-on Learners☆117Nov 3, 2019Updated 6 years ago
- Utilities to Retrieve Rulelists from Model Fits, Filter, Prune, Reorder and Predict on unseen data☆11Feb 4, 2025Updated last year
- ☆13Oct 21, 2020Updated 5 years ago
- Docker Apache Airflow☆13Mar 1, 2023Updated 3 years ago
- A grocery buying app☆12Jul 24, 2021Updated 4 years ago
- ☆16Dec 23, 2021Updated 4 years ago
- Python API for Informatica PowerCenter (pmrep, pmcmd)☆21Sep 17, 2017Updated 8 years ago
- My Reusable Notes☆26Jun 25, 2020Updated 5 years ago
- Dashboard to visualize the growth of coronavirus (plotly and dash)☆12May 22, 2023Updated 3 years ago
- Databricks - Apache Spark™ - 2X Certified Developer☆265Jul 24, 2020Updated 5 years ago
- Set of Shell scripts to automate Linux from Scratch, based on the book 7.8☆31Jan 10, 2018Updated 8 years ago
- Notes from 100 days with Kubernetes☆31Jan 25, 2019Updated 7 years ago
- All my projects on Big Data are provided☆27Dec 5, 2016Updated 9 years ago
- Complete Guide To Mastering Databricks☆44Feb 28, 2026Updated 2 months ago
- ☆11Jun 3, 2025Updated 11 months ago
- Kafka-Notes☆15Jun 20, 2021Updated 4 years ago
- Neural networks for machine learning☆17Oct 10, 2020Updated 5 years ago
- Kirk's Zeppelin Notebooks☆11May 22, 2018Updated 8 years ago
- Ansible crash course☆39May 3, 2019Updated 7 years ago
- Repository for the Demo of using DVC with PyCaret & MLOps (DVC Office Hours - 20th Jan, 2022)☆11Jan 20, 2022Updated 4 years ago
- This repository contains code for Spark Streaming☆26Mar 11, 2021Updated 5 years ago
- [NOT MAINTAINED] Create an ElasticSearch cluster with a simple single bash command. Config through environment variables: RAM, cluster na…☆59Jan 26, 2018Updated 8 years ago
- IoT, Big Data Analytics using Apache Kafka, Spark and other AWS services☆16Sep 11, 2020Updated 5 years ago
- How to manage Slowly Changing Dimensions with Apache Hive☆55Aug 27, 2019Updated 6 years ago