garystafford / pyspark-setup-demo
Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
☆35 Updated 4 years ago
Alternatives and similar repositories for pyspark-setup-demo
Users who are interested in pyspark-setup-demo are comparing it to the repositories listed below.
- One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager) ☆120 Updated 4 years ago
- Repository of sample Databricks notebooks ☆269 Updated last year
- scaffold of Apache Airflow executing Docker containers ☆86 Updated 2 years ago
- Airflow training for the crunch conf ☆104 Updated 7 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Workshop for Spark and Databricks ☆54 Updated 5 years ago
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR ☆175 Updated 5 months ago
- spark on kubernetes ☆104 Updated 2 years ago
- Interactive Notebooks that support the book ☆40 Updated 4 years ago
- My Study guide used to pass the CRT020 Spark Certification exam ☆34 Updated 5 years ago
- Sentiment Analysis of a Twitter Topic with Spark Structured Streaming ☆55 Updated 6 years ago
- Repository used for Spark Trainings ☆54 Updated 2 years ago
- Real-world Spark pipelines examples ☆84 Updated 7 years ago
- Create HTML profiling reports from Apache Spark DataFrames ☆198 Updated 5 years ago
- Repo for all my code on the articles I post on medium ☆107 Updated 3 years ago
- MLFlow Spark Summit 2019 Presentation ☆67 Updated 6 years ago
- A boilerplate for writing PySpark Jobs ☆394 Updated last year
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 Updated 9 months ago
- Databricks - Apache Spark™ - 2X Certified Developer ☆265 Updated 5 years ago
- An example PySpark project with pytest ☆17 Updated 8 years ago
- Example unit tests for Apache Spark Python scripts using the py.test framework ☆84 Updated 9 years ago
- Jupyter kernel for scala and spark ☆190 Updated last year
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆55 Updated 2 years ago
- Airflow basics tutorial ☆397 Updated 4 years ago
- Use Airflow to move data from multiple MySQL databases to BigQuery ☆100 Updated 5 years ago
- HandySpark - bringing pandas-like capabilities to Spark dataframes ☆196 Updated 6 years ago
- Code Repository for the EVO-ODAS ☆32 Updated 7 years ago
- Spark on Kubernetes using Helm ☆34 Updated 5 years ago
- ☆202 Updated 2 years ago
- Cloud Dataproc: Samples and Utils ☆205 Updated 4 months ago