ykursadkaya / pyspark-Docker
PySpark in Docker Containers
☆29 Updated 3 years ago
Alternatives and similar repositories for pyspark-Docker
Users who are interested in pyspark-Docker are comparing it to the repositories listed below.
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆168 Updated 2 years ago
- pytest plugin to run the tests with support of pyspark ☆86 Updated 6 months ago
- Delta lake and filesystem helper methods ☆51 Updated last year
- ☆202 Updated 2 years ago
- The source code for the book Modern Data Engineering with Apache Spark ☆39 Updated 3 years ago
- 🐍💨 Airflow tutorial for PyCon 2019 ☆87 Updated 3 years ago
- Delta Lake examples ☆235 Updated last year
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆38 Updated 4 years ago
- (project & tutorial) dag pipeline tests + ci/cd setup ☆89 Updated 4 years ago
- Airflow operator that can send messages to MS Teams ☆86 Updated last year
- Data validation library for PySpark 3.0.0 ☆33 Updated 3 years ago
- Teradata SQL Driver for Python ☆69 Updated this week
- Repository of sample Databricks notebooks ☆273 Updated last year
- Pandas helper functions ☆31 Updated 2 years ago
- Streaming demo dbt ☆17 Updated last year
- Data Engineering with Spark and Delta Lake ☆106 Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up ☆57 Updated 4 years ago
- Writing PySpark logs in Apache Spark and Databricks ☆17 Updated 3 years ago
- Demo of using the Nutter for testing of Databricks notebooks in the CI/CD pipeline ☆152 Updated last year
- Read Delta tables without any Spark ☆47 Updated last year
- Great Expectations Airflow operator ☆169 Updated last week
- An example MLflow project ☆280 Updated last year
- Cloned by the `dbt init` task ☆62 Updated last year
- scaffold of Apache Airflow executing Docker containers ☆85 Updated 3 years ago
- Materials for the next course ☆25 Updated 2 years ago
- Example of how to leverage Apache Spark distributed capabilities to call REST-API using a UDF ☆52 Updated 3 years ago
- VSCode extension to work with Databricks ☆131 Updated this week
- ☆179 Updated 3 years ago
- A series of Jupyter notebooks that walk you through Machine Learning with Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo… ☆87 Updated 2 years ago
- ☆93 Updated 2 years ago