whole-tale / all-spark-notebook
Jupyter Notebook with Spark support extracted from jupyter/docker-stacks
☆18Updated 6 years ago
Alternatives and similar repositories for all-spark-notebook:
Users who are interested in all-spark-notebook are comparing it to the repositories listed below
- Repo that relates to the Medium blog 'Keeping your ML model in shape with Kafka, Airflow and MLFlow'☆119Updated last year
- Docker compose and Google Colab demo to build a CDC with Delta Lake☆15Updated 2 years ago
- Microservices for Data science code for course☆8Updated 6 years ago
- ☆49Updated 3 years ago
- Just a boilerplate for PySpark and Flask☆35Updated 6 years ago
- Data Science Quick Tips Repository!☆47Updated last year
- A guide to show you how to import data for ETL☆20Updated 2 years ago
- Big Data Demystified meetup and blog examples☆31Updated 8 months ago
- Contains source files used in the Spark with Python course☆18Updated 6 years ago
- Partly lecture and partly a hands-on tutorial and workshop, this is a three-part series on how to get started with MLflow. In this four p…☆35Updated 4 years ago
- Delta-Lake, ETL, Spark, Airflow☆47Updated 2 years ago
- Repository used for Spark Trainings☆53Updated 2 years ago
- Partly lecture and partly a hands-on tutorial and workshop, this is a three-part series on how to get started with MLflow. In this four p…☆39Updated 4 years ago
- Zeppelin docker☆15Updated 4 years ago
- A repository for a PySpark Cookbook by Tomasz Drabas and Denny Lee☆59Updated 6 years ago
- Batch processing, orchestration using Apache Airflow and Google Workflows, Spark Structured Streaming, and a lot more☆19Updated 2 years ago
- Iowa House Prices Kaggle (top 5%)☆13Updated 10 months ago
- Design/Implement stream/batch architecture on NYC taxi data | #DE☆25Updated 3 years ago
- ☆92Updated 2 years ago
- Apache Spark 3 - Structured Streaming Course Material☆121Updated last year
- A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS).☆15Updated 3 years ago
- Blog post on ETL pipelines with Airflow☆23Updated 4 years ago
- Source code for the MC technical blog post "Data Observability in Practice Using SQL"☆38Updated 9 months ago
- ☆40Updated 7 years ago
- ☆25Updated last year
- PySpark Cookbook, published by Packt☆91Updated 2 years ago
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines.☆47Updated last year
- ETL pipeline using pyspark (Spark - Python)☆115Updated 5 years ago
- Python and Airflow - Data Pipeline Orchestration☆17Updated last year
- Spark and Python (PySpark) Examples☆39Updated 3 years ago