edgBR / delta-lake-polars
Building a poor man's data lake: Exploring the Power of Polars and Delta Lake
☆10 Updated 2 weeks ago
Alternatives and similar repositories for delta-lake-polars
Users who are interested in delta-lake-polars are comparing it to the libraries listed below.
- ☆30 Updated 11 months ago
- API for distributing Data Lake Data ☆11 Updated 2 months ago
- Cost Efficient Data Pipelines with DuckDB ☆53 Updated 3 weeks ago
- dlt-dagster-demo ☆11 Updated last year
- A platform and cloud-based service for data sharing based on the Delta Sharing protocol. ☆21 Updated 11 months ago
- Automate and streamline the alerting & notification process for dbt test results🐞🚀 ☆17 Updated last month
- ☆18 Updated 9 months ago
- Personal project for setting up an open source data warehouse. ☆30 Updated 4 months ago
- duckdb-etl-framework ☆11 Updated 5 months ago
- ☆16 Updated last year
- Example projects built on MotherDuck ☆28 Updated 3 weeks ago
- A declarative PySpark framework for row- and aggregate-level data quality validation. ☆46 Updated this week
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆11 Updated last year
- Native polars deltalake reader ☆9 Updated 9 months ago
- Read Apache Arrow batches from ODBC data sources in Python ☆65 Updated this week
- Azure extension for DuckDB ☆59 Updated last week
- A python SPark ETL libRary (SPETLR) for Databricks. https://discord.gg/p9bzqGybVW ☆20 Updated this week
- A DataOps framework for building a lakehouse. ☆50 Updated this week
- Delta reader for the Ray open-source toolkit for building ML applications ☆46 Updated last year
- Building Data Lakehouse by open source technology. Support end to end data pipeline, from source data on AWS S3 to Lakehouse, visualize a… ☆27 Updated last year
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit… ☆56 Updated last week
- Utility functions for dbt projects running on Spark ☆34 Updated 3 months ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 Updated 2 years ago
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆52 Updated 6 months ago
- ☆39 Updated 11 months ago
- A "modern" Strava data pipeline fueled by dlt, duckdb, dbt, and evidence.dev ☆33 Updated 3 weeks ago
- Stream Processing using Polars ☆30 Updated 2 years ago
- Fabric Python Notebooks examples ☆74 Updated last week
- Fake Pandas / PySpark DataFrame creator ☆47 Updated last year
- rust-for-data ☆45 Updated last year