patvarilly / python-and-spark-for-data-analysis
A four-day course on Python, the Scientific Python stack and PySpark, adapted from a training course I gave to one of our clients in December 2015
☆10 Updated 8 years ago
Alternatives and similar repositories for python-and-spark-for-data-analysis:
Users interested in python-and-spark-for-data-analysis are comparing it to the repositories listed below.
- Airflow training for the crunch conf ☆104 Updated 6 years ago
- Workshop for Spark and Databricks ☆54 Updated 5 years ago
- ☆16 Updated 7 years ago
- A simple introduction to using spark ml pipelines ☆26 Updated 6 years ago
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- PySpark phonetic and string matching algorithms ☆37 Updated 11 months ago
- Code to build a simple analytics data pipeline with Python ☆102 Updated 7 years ago
- Repository used for Spark Trainings ☆53 Updated last year
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines. ☆46 Updated last year
- Udacity Data Pipeline Exercises ☆15 Updated 4 years ago
- Sharing interesting and noteworthy Data Engineering content ☆65 Updated 8 years ago
- How to build an awesome data engineering team ☆99 Updated 5 years ago
- ☆196 Updated last year
- 🐍💨 Airflow tutorial for PyCon 2019 ☆85 Updated 2 years ago
- (project & tutorial) dag pipeline tests + ci/cd setup ☆86 Updated 3 years ago
- A curated list of all the awesome examples, articles, tutorials and videos for Apache Airflow. ☆96 Updated 4 years ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆166 Updated last year
- Helping you get Airflow running in production. ☆9 Updated 5 years ago
- Learn the pyspark API through pictures and simple examples ☆169 Updated 3 years ago
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 Updated 8 years ago
- Example of an ETL Pipeline using Airflow ☆32 Updated 7 years ago
- Create data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets. ☆13 Updated 5 years ago
- Tutorial repo for the article "ML in Production" ☆30 Updated last year
- Course materials for my data pipeline video course with O'Reilly ☆194 Updated 7 years ago
- MLFlow Spark Summit 2019 Presentation ☆67 Updated 5 years ago
- Various data stream/batch process demo with Apache Scala Spark 🚀 ☆11 Updated 4 years ago
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆36 Updated 6 months ago