danielbeach / lakescum
A Python package that helps Databricks Unity Catalog users read and query Delta Lake tables with Polars, DuckDB, or PyArrow.
☆23 Updated last year
Alternatives and similar repositories for lakescum
Users who are interested in lakescum are comparing it to the libraries listed below.
- Utility functions for dbt projects running on Spark ☆34 Updated 3 months ago
- Data-aware orchestration with dagster, dbt, and airbyte ☆31 Updated 2 years ago
- Pytest plugin for dbt core ☆60 Updated 4 months ago
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆50 Updated last year
- A dbt-core python package that automates the management and creation of dbt groups, contracts, access, and versions. ☆121 Updated 4 months ago
- A "modern" Strava data pipeline fueled by dlt, duckdb, dbt, and evidence.dev ☆33 Updated 3 weeks ago
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆52 Updated 6 months ago
- Modern serverless lakehouse implementing HOOK methodology, Unified Star Schema (USS), and Analytical Data Storage System (ADSS) principle… ☆115 Updated 2 months ago
- 🏁 A sweet and speedy code generator for dbt 🏎️✨ ☆27 Updated 11 months ago
- Generate DBT tests based on sample data ☆36 Updated last year
- New generation opensource data stack ☆68 Updated 3 years ago
- A DataOps framework for building a lakehouse. ☆50 Updated this week
- Macros for generating dbt model data profiles ☆88 Updated 6 months ago
- A write-audit-publish implementation on a data lake without the JVM ☆46 Updated 9 months ago
- A dbt package for easily using production data in a development environment. ☆42 Updated 2 weeks ago
- A dbt-Core package for generating models from an activity stream. ☆42 Updated last year
- Repo for orienting dbt users to the Dagster asset framework ☆54 Updated 2 years ago
- A bunch of hacks developed around dbt ☆48 Updated 5 years ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 Updated 2 years ago
- Fake Pandas / PySpark DataFrame creator ☆47 Updated last year
- Personal project for setting up an open source data warehouse. ☆30 Updated 4 months ago
- Cost Efficient Data Pipelines with DuckDB ☆53 Updated 3 weeks ago
- Alto is a versatile data integration tool that allows you to easily run Singer plugins, build and cache PEX files encapsulating those plu… ☆61 Updated 2 years ago
- A framework to manage data, continuously ☆32 Updated 4 months ago
- The shared semantic layer definitions that dbt-core and MetricFlow use. ☆78 Updated 2 months ago
- Palm CLI - the tool-belt for data teams ☆47 Updated last year
- Make dbt great again! Enables end user to extend dbt to his/her needs ☆76 Updated 2 weeks ago
- Delta Lake Documentation ☆49 Updated 11 months ago
- Example Dagster Cloud code for the Hooli Data Engineering organization. ☆4 Updated this week
- A Table format agnostic data sharing framework ☆38 Updated last year