Nike-Inc / spark-expectations
A Python library to support running data quality rules while the Spark job is running ⚡
☆161 · Updated last month
Related projects:
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆185 · Updated this week
- Delta Lake helper methods in PySpark ☆294 · Updated 2 weeks ago
- Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used … ☆309 · Updated last week
- Delta Lake examples ☆201 · Updated 3 months ago
- Metadata driven Databricks Delta Live Tables framework for bronze/silver pipelines ☆142 · Updated this week
- A dbt adapter for Databricks. ☆211 · Updated this week
- A Swiss-Army-knife for your Data Intelligence platform administration. ☆104 · Updated last month
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows ☆41 · Updated 2 months ago
- Delta lake and filesystem helper methods ☆48 · Updated 6 months ago
- Examples of Databricks Asset Bundles ☆81 · Updated last week
- Spark style guide ☆255 · Updated last year
- Databricks SQL Connector for Python ☆153 · Updated this week
- Extensible Rules Engine for custom Dataframe / Dataset validation ☆134 · Updated 4 months ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆215 · Updated last week
- An example showing how to apply software engineering best practices to Databricks notebooks. ☆118 · Updated last month
- Delta Lake Documentation ☆45 · Updated 3 months ago
- Capture deep metrics on one or all assets within a Databricks workspace ☆226 · Updated last week
- Automated migrations to Unity Catalog ☆218 · Updated this week
- Notebooks, terraform, tools to enable setting up Unity Catalog ☆44 · Updated last year
- Databricks SDK for Python (Beta) ☆345 · Updated this week
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse. ☆40 · Updated last month
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 · Updated 2 years ago
- Delta Lake helper methods. No Spark dependency. ☆21 · Updated last week
- A library that provides useful extensions to Apache Spark and PySpark. ☆193 · Updated this week
- Code samples, etc. for Databricks ☆59 · Updated last month
- Snowflake Data Source for Apache Spark. ☆213 · Updated this week
- Databricks CLI ☆129 · Updated this week
- A lightweight helper utility which allows developers to do interactive pipeline development by having a unified source code for both DLT … ☆46 · Updated last year
- ✨ A Pydantic to PySpark schema library ☆53 · Updated this week
- 🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management. ☆438 · Updated last week