tmcgrath / spark-with-python-course
Contains source files used in the Spark with Python course
☆18 Updated 5 years ago
Alternatives and similar repositories for spark-with-python-course:
Users who are interested in spark-with-python-course are comparing it to the repositories listed below.
- Mastering Spark for Data Science, published by Packt ☆47 Updated 2 years ago
- Spark and Python (PySpark) Examples ☆40 Updated 3 years ago
- Apache Spark in 7 Days [Video], by Packt Publishing ☆17 Updated 2 years ago
- Business Data Analysis by HiPIC of CalStateLA ☆20 Updated 6 years ago
- Repository used for Spark Trainings ☆53 Updated last year
- Scala and Spark for Big Data Analytics, published by Packt ☆35 Updated last year
- ☆37 Updated 8 years ago
- ☆26 Updated last year
- Batch processing, orchestration using Apache Airflow and Google Workflows, Spark Structured Streaming, and a lot more ☆19 Updated 2 years ago
- Code repository for Large Scale Machine Learning with Spark by Packt ☆20 Updated 2 years ago
- AWS Big Data Certification ☆25 Updated 2 weeks ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 Updated 8 years ago
- Learning PySpark video series ☆11 Updated 6 years ago
- ☆16 Updated last year
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 Updated 7 years ago
- Spark Projects for the Berkeley Data Science Course ☆11 Updated 9 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆66 Updated 9 years ago
- Sentiment Analysis of a Twitter Topic with Spark Structured Streaming ☆55 Updated 6 years ago
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 Updated 8 years ago
- ☆15 Updated 7 years ago
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 Updated 5 years ago
- Frank Kane's Taming Big Data with Apache Spark and Python, published by Packt ☆118 Updated 2 years ago
- Workshop for Spark and Databricks ☆54 Updated 5 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- Study notes and demos. ☆12 Updated 11 months ago
- Data Science and Machine Learning with Python - Hands On from Udemy ☆14 Updated 7 years ago
- PySpark Cookbook, published by Packt ☆90 Updated 2 years ago
- An example PySpark project with pytest ☆17 Updated 7 years ago
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- Hands-On Data Analysis with Scala, published by Packt ☆20 Updated 2 years ago