jimdowling / cjsurf
Lahinch surf predictions with Hopsworks
☆15 Updated 2 weeks ago
Alternatives and similar repositories for cjsurf
Users that are interested in cjsurf are comparing it to the libraries listed below
- ☆17 Updated 2 years ago
- Code examples for the Introduction to Kubeflow course ☆14 Updated 4 years ago
- ☆86 Updated 2 years ago
- This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For this project, I r… ☆19 Updated 4 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- PySpark phonetic and string matching algorithms ☆39 Updated last year
- Public source code for the Batch Processing with Apache Beam (Python) online course ☆18 Updated 4 years ago
- ☆58 Updated last year
- Instant search for and access to many datasets in PySpark. ☆34 Updated 2 years ago
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38 Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆29 Updated this week
- Python - Java/Scala API for the Hopsworks feature store ☆54 Updated last week
- Snowflake Guide: Building a Recommendation Engine Using Snowflake & Amazon SageMaker ☆31 Updated 4 years ago
- This repository contains the TPC-DS queries together with the code required to run this benchmark for dbt and DuckDB ☆18 Updated last year
- Yet Another (Spark) ETL Framework ☆21 Updated last year
- ☆40 Updated 3 years ago
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou… ☆10 Updated last year
- Spark and Delta Lake Workshop ☆22 Updated 2 years ago
- Read Delta tables without any Spark ☆47 Updated last year
- A Python package to create a database on the platform using our moj data warehousing framework ☆21 Updated 9 months ago
- Fake Pandas / PySpark DataFrame creator ☆47 Updated last year
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated last year
- A table-format-agnostic data sharing framework ☆38 Updated last year
- This repository contains recipes for Apache Pinot. ☆30 Updated 3 months ago
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆38 Updated 10 months ago
- PDF DataSource for Apache Spark; allows reading PDF files directly into a DataFrame and running OCR on them ☆67 Updated last month
- Delta Lake helper methods. No Spark dependency. ☆23 Updated 8 months ago
- Demonstrating and Building ML pipelines in Airflow ☆11 Updated 3 years ago
- Demonstration code for MLeap, both Jupyter notebooks and projects ☆24 Updated 5 years ago
- Scaling Python Machine Learning ☆46 Updated last year