fugue-project / tutorials
Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask without any rewrites.
☆113 · Updated last year
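The one-line description above is the whole pitch, so a minimal sketch of the pattern the tutorials walk through may help: Fugue's `transform()` lets you write ordinary Pandas logic once and choose the execution engine at call time. The `add_price_with_tax` function, the sample data, and the column names below are made up for illustration, and the distributed variant assumes a Spark or Dask backend is installed alongside `fugue`.

```python
import pandas as pd
from fugue import transform

def add_price_with_tax(df: pd.DataFrame, rate: float = 0.08) -> pd.DataFrame:
    # Plain Pandas logic -- no Spark or Dask imports needed here.
    return df.assign(price_with_tax=df["price"] * (1 + rate))

df = pd.DataFrame({"item": ["a", "b"], "price": [10.0, 20.0]})

# Runs locally on Pandas when no engine is given.
result = transform(
    df,
    add_price_with_tax,
    schema="*,price_with_tax:double",
    params={"rate": 0.08},
)
print(result)

# The same call can target a distributed backend without rewriting the
# function (assumes that backend is available in the environment), e.g.:
# transform(df, add_price_with_tax,
#           schema="*,price_with_tax:double", engine="spark")
```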
Alternatives and similar repositories for tutorials
Users interested in tutorials are comparing it to the libraries listed below.
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆195 · Updated last week
- An abstraction layer for parameter tuning ☆35 · Updated 9 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆109 · Updated 5 months ago
- Fake Pandas / PySpark DataFrame creator ☆47 · Updated last year
- Read Delta tables without any Spark ☆47 · Updated last year
- Plugins, extensions, case studies, articles, and video tutorials for Kedro ☆79 · Updated 6 months ago
- A simple and easy-to-use Data Quality (DQ) tool built with Python. ☆50 · Updated last year
- Ingesting data with Pulumi, AWS Lambdas and Snowflake in a scalable, fully replayable manner ☆71 · Updated 3 years ago
- First-party plugins maintained by the Kedro team. ☆103 · Updated this week
- Makes it simple to store test results and visualise them in a BI dashboard ☆45 · Updated this week
- ☆129 · Updated last month
- Kedro plugin to support running workflows on Kubeflow Pipelines ☆54 · Updated 9 months ago
- The easiest way to integrate Kedro and Great Expectations ☆52 · Updated 2 years ago
- Sample projects using Ploomber. ☆86 · Updated last year
- Pythonic programming framework to orchestrate jobs in Databricks Workflows ☆218 · Updated this week
- Build your feature store with macros right within your dbt repository ☆38 · Updated 2 years ago
- Data-aware orchestration with Dagster, dbt, and Airbyte ☆31 · Updated 2 years ago
- A Python package that helps Databricks Unity Catalog users read and query Delta Lake tables with Polars, DuckDB, or PyArrow. ☆25 · Updated last year
- Delta Lake helper methods. No Spark dependency. ☆23 · Updated 9 months ago
- Write your dbt models using Ibis ☆67 · Updated 3 months ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆80 · Updated last year
- A JupyterLab extension providing a SQL formatter, auto-completion, and syntax highlighting, with support for Spark SQL and Trino ☆88 · Updated 2 weeks ago
- Write Python locally, execute SQL in your data warehouse ☆269 · Updated 2 years ago
- Swiple enables you to easily observe, understand, validate, and improve the quality of your data ☆84 · Updated this week
- ✨ A Pydantic to PySpark schema library ☆94 · Updated this week
- Dask integration for Snowflake ☆30 · Updated 7 months ago
- Read Apache Arrow batches from ODBC data sources in Python ☆65 · Updated 3 weeks ago
- A FastMCP tool to search and retrieve Polars API documentation. ☆61 · Updated 3 weeks ago
- Soda Spark is a PySpark library that helps you test your data in Spark DataFrames ☆64 · Updated 3 years ago
- Pandas helper functions ☆31 · Updated 2 years ago