runawayhorse001 / learning-apache-spark
☆17 Updated 7 years ago
Alternatives and similar repositories for learning-apache-spark
Users who are interested in learning-apache-spark are comparing it to the libraries listed below.
- Hey this is the repo that has all the queries and data for my video game training series! ☆149 Updated 3 years ago
- [DEPRECATED] Demo repository implementing an end-to-end MLOps workflow on Databricks. Project derived from dbx basic python template ☆114 Updated 2 years ago
- Guide for databricks spark certification ☆58 Updated 4 years ago
- The data science project used in my Datacamp course Unit Testing for Data Science in Python ☆143 Updated 2 years ago
- A Data Engineering & Machine Learning Knowledge Hub ☆1,133 Updated last year
- A tutorial for the Great Expectations library. ☆71 Updated 4 years ago
- 🧱 A collection of supplementary utilities and helper notebooks to perform admin tasks on Databricks ☆56 Updated last month
- The official repository for the Rock the JVM Spark Optimization with Scala course ☆58 Updated last year
- Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution ☆69 Updated 5 years ago
- The official repository for the Rock the JVM Spark Optimization 2 course ☆40 Updated last year
- Spark style guide ☆260 Updated 10 months ago
- This is the repository for my YouTube course on End to End Apache Spark on the AIEngineering YouTube channel ☆189 Updated 4 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆90 Updated 3 years ago
- Data engineering interviews Q&A for data community by data community ☆64 Updated 5 years ago
- Just starting your DE journey or along the way already? I will be sharing a short list of DATA-ENGINEERING-CENTRED books that cover the… ☆34 Updated 3 years ago
- O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian ☆219 Updated 2 years ago
- (project & tutorial) dag pipeline tests + ci/cd setup ☆88 Updated 4 years ago
- PySpark Cheatsheet ☆36 Updated 2 years ago
- ☆86 Updated 2 years ago
- ☆35 Updated 2 years ago
- PySpark test helper methods with beautiful error messages ☆709 Updated last week
- Example repo to kickstart integration with mlflow pipelines. ☆77 Updated 2 years ago
- Spark and Delta Lake Workshop ☆22 Updated 3 years ago
- PySpark data-pipeline testing and CICD ☆28 Updated 4 years ago
- LearningApacheSpark ☆245 Updated last year
- Code repository for the "PySpark in Action" book ☆206 Updated 2 months ago
- Delta Lake examples ☆227 Updated 10 months ago
- Databricks - Apache Spark™ - 2X Certified Developer ☆265 Updated 5 years ago
- Example repo to create end to end tests for data pipeline. ☆25 Updated last year
- This repo contains commands that data engineers use in day to day work. ☆61 Updated 2 years ago