jimdowling / cjsurf
Lahinch surf predictions with Hopsworks
☆15Updated 4 months ago
Alternatives and similar repositories for cjsurf
Users that are interested in cjsurf are comparing it to the libraries listed below
- Data validation library for PySpark 3.0.0☆33Updated 2 years ago
- ☆89Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆30Updated 2 weeks ago
- Public source code for the Batch Processing with Apache Beam (Python) online course☆18Updated 5 years ago
- Ingesting data with Pulumi, AWS lambdas and Snowflake in a scalable, fully replayable manner☆71Updated 3 years ago
- PDF DataSource for Apache Spark; allows reading PDF files directly into a DataFrame and running OCR on them☆75Updated 5 months ago
- Supporting materials/code examples for my course in data engineering for machine learning.☆38Updated 2 years ago
- A Table format agnostic data sharing framework☆39Updated last year
- PySpark phonetic and string matching algorithms☆39Updated last year
- Code examples for the Introduction to Kubeflow course☆14Updated 4 years ago
- Data-aware orchestration with dagster, dbt, and airbyte☆30Updated 2 years ago
- ☆42Updated 5 years ago
- Spark and Delta Lake Workshop☆22Updated 3 years ago
- Repository of notebooks and related collateral used in the Databricks Demo Hub, showing how to use Databricks, Delta Lake, MLflow, and mo…☆25Updated 4 years ago
- Fake Pandas / PySpark DataFrame creator☆48Updated last year
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them☆26Updated last year
- Read Delta tables without any Spark☆47Updated last year
- Code that was used as an example during the Data+AI Summit 2020☆15Updated 4 years ago
- Delta Lake Documentation☆50Updated last year
- A simple introduction to using spark ml pipelines☆26Updated 7 years ago
- Delta Lake examples☆229Updated last year
- Utility functions for dbt projects running on Spark☆33Updated 8 months ago
- Capturing model drift and handling its response - Example webinar☆108Updated 6 years ago
- How to evaluate the Quality of your Data with Great Expectations and Spark.☆31Updated 2 years ago
- Workshop for Spark and Databricks☆54Updated 5 years ago
- Modern Techniques for Data Science with Big Datasets☆12Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up☆56Updated 4 years ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.☆81Updated last year
- Example of a scalable IoT data processing pipeline setup using Databricks☆32Updated 4 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python.☆113Updated 2 months ago