A comprehensive Spark guide, collated from multiple sources, that can be used to learn more about Spark or as an interview refresher.
☆685Apr 22, 2022Updated 3 years ago
Alternatives and similar repositories for SparkLearning
Users that are interested in SparkLearning are comparing it to the libraries listed below
- ☆19Jun 22, 2022Updated 3 years ago
- A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!☆840Apr 16, 2022Updated 3 years ago
- Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tupl…☆816Aug 10, 2025Updated 6 months ago
- A course by DataTalks Club that covers Spark, Kafka, Docker, Airflow, Terraform, dbt, BigQuery, etc.☆16Mar 18, 2022Updated 3 years ago
- Roadmap to becoming a data engineer in 2021☆12,745Jan 25, 2022Updated 4 years ago
- Data Engineering Practice Problems☆2,547Jan 8, 2025Updated last year
- The Data Engineering Cookbook☆14,959Jan 17, 2026Updated last month
- Example end to end data engineering project.☆1,387Dec 8, 2022Updated 3 years ago
- A list of useful resources to learn Data Engineering from scratch☆3,952Jun 19, 2024Updated last year
- Trident provides an easy way to pass the output of one command to any number of targets.☆34Sep 26, 2021Updated 4 years ago
- 🐺 Deploy Databases and Services Easily for Development and Testing Pipelines.☆727Updated this week
- ☆42Nov 19, 2021Updated 4 years ago
- A cookbook of best practices for working with Kubernetes.☆1,476Jan 6, 2026Updated last month
- Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Jo…☆38,735Updated this week
- PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster☆488Oct 15, 2024Updated last year
- Implementing best practices for PySpark ETL jobs and applications.☆2,074Jan 1, 2023Updated 3 years ago
- Accumulated knowledge and experience in the field of Data Engineering☆871Nov 22, 2022Updated 3 years ago
- 📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.☆28,698Jul 18, 2024Updated last year
- Query language for efficient data extraction from Wikipedia☆348Feb 16, 2022Updated 4 years ago
- Reference python implementation of Chia pool operations for pool operators☆434Jan 5, 2026Updated last month
- A curated list of references for MLOps☆13,714Nov 21, 2024Updated last year
- A Data Engineering & Machine Learning Knowledge Hub☆1,140Feb 2, 2024Updated 2 years ago
- 🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.☆3,139Feb 10, 2026Updated 2 weeks ago
- Always know what to expect from your data.☆11,162Feb 20, 2026Updated last week
- https://huyenchip.com/ml-interviews-book/☆4,529Mar 21, 2025Updated 11 months ago
- Contains solutions to interview questions☆12May 18, 2018Updated 7 years ago
- For my midterm project of the Machine Learning Zoomcamp, I decided to work in the Open Bioinformatics Research Project proposed by Data P…☆10Nov 2, 2021Updated 4 years ago
- Dead easy interface for executing many HTTP requests asynchronously. Also provides helper functions for executing embarrassingly parallel…☆385Mar 23, 2021Updated 4 years ago
- This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring…☆1,226Sep 8, 2025Updated 5 months ago
- A data augmentations library for audio, image, text, and video.☆5,071Feb 13, 2026Updated 2 weeks ago
- Deploy a ML inference service on a budget in less than 10 lines of code.☆1,344Feb 12, 2024Updated 2 years ago
- The Ultimate FREE Machine Learning Study Plan☆3,165Jun 11, 2024Updated last year
- Otto makes machine learning an intuitive, natural language experience. 🏆 Facebook AI Hackathon winner ⭐️ #1 Trending on MadeWithML.com …☆960Mar 6, 2023Updated 2 years ago
- A collection of research papers and software related to explainability in graph machine learning.☆1,985Apr 4, 2022Updated 3 years ago
- A whole new world of 300+ developer cheatsheets (discontinued)☆968Mar 1, 2023Updated 2 years ago
- Audio Editor☆892Jan 17, 2026Updated last month
- Automatic test case generation and static analysis library for Python☆264Mar 28, 2022Updated 3 years ago
- Personal Data Engineering Projects☆993Feb 8, 2023Updated 3 years ago
- ☆93Sep 14, 2022Updated 3 years ago