MoranReznik / PySpark-Reference-Notebook
☆63 Updated 2 years ago
Related projects
Alternatives and complementary repositories for PySpark-Reference-Notebook
- ☆128 Updated last year
- I will attempt to create my own Spotify Wrapped by collecting data from the Spotify API, performing transformations, and creating informative d… ☆74 Updated last year
- ☆30 Updated last year
- ☆27 Updated last year
- My current data engineering portfolio. Includes projects spanning ETL, orchestration, and dashboarding. ☆103 Updated 7 months ago
- My repo for the Machine Learning Engineering bootcamp 2022 by DataTalks.Club ☆21 Updated last year
- ☆23 Updated last year
- Ultimate guide for mastering Spark performance tuning and optimization concepts and for preparing for data engineering interviews ☆70 Updated 6 months ago
- Produce Kafka messages, consume them, and upload them into Cassandra and MongoDB. ☆37 Updated last year
- Final project of the MLOps Zoomcamp hosted by DataTalksClub. ☆25 Updated last year
- For this project I am creating an ETL (Extract, Transform, and Load) pipeline using Python, RegEx, and a SQL database. The goal is to retri… ☆25 Updated 3 years ago
- Get data from an API, run a scheduled script with Airflow, send the data to Kafka and consume it with Spark, then write to Cassandra ☆128 Updated last year
- YouTube tutorial project ☆94 Updated last year
- ☆31 Updated 2 years ago
- ☆41 Updated last year
- Hey, this is the repo that has all the queries and data for my video game training series! ☆132 Updated 2 years ago
- ☆57 Updated 3 years ago
- PySpark functions and utilities with examples. Assists the ETL process of data modeling ☆99 Updated 3 years ago
- Sample project to demonstrate data engineering best practices ☆166 Updated 8 months ago
- ☆86 Updated 2 years ago
- ☆18 Updated 10 months ago
- Public data and analytics for our open course ☆30 Updated 7 months ago
- PySpark Projects ☆21 Updated 3 weeks ago
- Solutions for the Practical Data Science Specialization on Coursera (offered by deeplearning.ai) ☆59 Updated 3 years ago
- Classwork projects and homework done through the Udacity Data Engineering Nanodegree ☆74 Updated 11 months ago
- I am using a Confluent Kafka cluster to produce and consume scraped data. In this project, I've created a real-time data pipeline that uti… ☆28 Updated last year
- Price Crawler - Tracking Price Inflation ☆184 Updated 4 years ago
- Mastering Big Data Analytics with PySpark, published by Packt ☆156 Updated 3 months ago
- In this project, we will build an ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS. The pipeline will retrieve data fro… ☆21 Updated last year