jason-jz-zhu / databathing
☆22 · Updated last week
Alternatives and similar repositories for databathing:
Users who are interested in databathing are comparing it to the libraries listed below.
- Yet Another (Spark) ETL Framework ☆20 · Updated last year
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 · Updated last year
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆23 · Updated last year
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 · Updated 3 years ago
- ☆11 · Updated 2 years ago
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆88 · Updated this week
- Unity Catalog UI ☆39 · Updated 6 months ago
- Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines. ☆83 · Updated this week
- Data Catalog for Databases and Data Warehouses ☆33 · Updated last year
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆36 · Updated 4 years ago
- Python binding for DataFusion ☆59 · Updated 2 years ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis ☆9 · Updated last year
- A Table format agnostic data sharing framework ☆38 · Updated last year
- A platform and cloud-based service for data sharing based on the Delta Sharing protocol. ☆21 · Updated 9 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 · Updated this week
- Simple Workflow Framework - Hamilton + APScheduler = FlowerPower ☆15 · Updated 2 weeks ago
- ERPL is a DuckDB extension to integrate Enterprise Data in your Data Science and ML pipelines within minutes! ERPL connects DuckDB to SAP… ☆37 · Updated 8 months ago
- ☆27 · Updated last year
- A write-audit-publish implementation on a data lake without the JVM ☆46 · Updated 7 months ago
- Ibis analytics, with Ibis (and more!) ☆20 · Updated 5 months ago
- Demos of Materialize, the operational data warehouse. ☆51 · Updated 2 weeks ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆58 · Updated last year
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 · Updated last year
- Delta Acceptance Testing ☆20 · Updated 7 months ago
- ☆37 · Updated this week
- Sample code to accompany blog post showcasing Arrow Flight SQL running on DuckDB ☆31 · Updated 2 years ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆17 · Updated 3 years ago
- The Internals of PySpark ☆26 · Updated 2 months ago
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started! ☆62 · Updated 2 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 · Updated 3 years ago