Ultimate guide for mastering Spark Performance Tuning and Optimization concepts and for preparing for Data Engineering interviews
☆211Dec 31, 2025Updated 2 months ago
Alternatives and similar repositories for spark-experiments
Users that are interested in spark-experiments are comparing it to the libraries listed below
- In this project, we will build an ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS. The pipeline will retrieve data fro…☆25May 6, 2023Updated 2 years ago
- Ravi Azure ADB ADF Repository☆64Jan 25, 2025Updated last year
- Here lie all the pieces of portfolio projects and documents that I have been harvesting throughout the journey of learning Data Analysis…☆11Nov 22, 2023Updated 2 years ago
- ☆10May 3, 2025Updated 10 months ago
- Contains Spark DataFrame solutions to LeetCode questions☆24Dec 13, 2022Updated 3 years ago
- Build and run Spark Structured Streaming pipelines in Hadoop - project using PySpark.☆13Jun 6, 2019Updated 6 years ago
- ☆16May 23, 2025Updated 9 months ago
- I have tried to solve some complex SQL interview questions that have been asked at several companies. Collected these questions from Ankit Ban…☆102May 15, 2022Updated 3 years ago
- ☆17Jun 23, 2024Updated last year
- ☆19Sep 5, 2021Updated 4 years ago
- This repository contains my solutions to the top 50 LeetCode SQL challenges implemented using PySpark DataFrame and PySpark SQL.☆29Mar 16, 2024Updated last year
- ☆10May 3, 2021Updated 4 years ago
- Machine Learning Engineer interview preparation. Brushing up Data Structures & Algorithms, System Design and SQL☆24Jun 10, 2021Updated 4 years ago
- Learn PySpark from basics to advanced. Check out the YouTube series: [PySpark - Zero to Hero]☆133Sep 7, 2025Updated 6 months ago
- This project involves an ETL (Extract, Transform, Load) process to analyze sleep data exported from Apple Health☆29Apr 29, 2023Updated 2 years ago
- More than 2,000 data engineer interview questions.☆1,531Jan 13, 2026Updated last month
- PySpark Projects☆27Feb 3, 2026Updated last month
- This repository focuses on providing interview scenario questions that I have encountered during interviews. The questions are designed t…☆48Feb 11, 2025Updated last year
- GitHub repository related to the course Mastering Elastic Map Reduce for Data Engineers☆24Jul 31, 2022Updated 3 years ago
- Apache Spark Interview Questions and Answers☆21Oct 13, 2020Updated 5 years ago
- A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!☆857Apr 16, 2022Updated 3 years ago
- The official repository for the Rock the JVM Spark Optimization with Scala course☆58Dec 4, 2023Updated 2 years ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on…☆28Jun 13, 2022Updated 3 years ago
- Master Big Data With PySpark and AWS☆132Jun 27, 2023Updated 2 years ago
- ☆29Jul 29, 2023Updated 2 years ago
- ☆92Dec 17, 2024Updated last year
- Data sets and ML models versioning example from DVC get started☆10Jun 4, 2024Updated last year
- ☆10Jun 21, 2021Updated 4 years ago
- Yolo v4☆31May 2, 2021Updated 4 years ago
- Learn various Machine Learning algorithms like SVC, Decision Tree, Random Forest, Logistic Regression, Linear Regression and much Mo…☆11Jul 31, 2019Updated 6 years ago
- ☆388Jan 26, 2025Updated last year
- Open Source LeetCode for PySpark, Spark, Pandas and DBT/Snowflake☆257Jun 27, 2025Updated 8 months ago
- Implementing best practices for PySpark ETL jobs and applications.☆2,081Jan 1, 2023Updated 3 years ago
- ☆27Apr 26, 2020Updated 5 years ago
- Dockerized monitoring stack for Apache Airflow☆36Sep 8, 2024Updated last year
- This is a repo with links to everything you'd ever want to learn about data engineering☆40,451Feb 26, 2026Updated last week
- DataCamp Data Engineer with Python course. 73 hours / 19 courses / 2 skill assessments☆141Nov 29, 2022Updated 3 years ago
- ☆30Nov 16, 2023Updated 2 years ago
- ☆33Sep 29, 2020Updated 5 years ago