jason-jz-zhu / databathing
☆22 Updated 4 months ago
Alternatives and similar repositories for databathing
Users who are interested in databathing are comparing it to the libraries listed below.
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆92 Updated 3 months ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- IceRunner is an Apache Arrow Flight Server Implementation for Apache Iceberg Tables ☆9 Updated 3 months ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆17 Updated 4 years ago
- Data Catalog for Databases and Data Warehouses ☆35 Updated last year
- ☆11 Updated 2 years ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 Updated 7 months ago
- Dremio Flight connector. Access Dremio using Arrow Flight ☆40 Updated 4 years ago
- Unity Catalog UI ☆40 Updated 10 months ago
- Data pipelines from re-usable components ☆108 Updated 2 years ago
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started! ☆60 Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆29 Updated 3 weeks ago
- Demos of Materialize, the operational data warehouse. ☆51 Updated 4 months ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆37 Updated 4 years ago
- DB API 2 interface for Flight SQL with SQLAlchemy extras. ☆39 Updated 3 months ago
- ☆33 Updated last year
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆24 Updated last year
- Yet Another (Spark) ETL Framework ☆21 Updated last year
- Python stream processing for analytics ☆40 Updated last week
- A write-audit-publish implementation on a data lake without the JVM ☆46 Updated 11 months ago
- A Data Mesh demo repository ☆13 Updated 9 months ago
- Python binding for DataFusion ☆59 Updated 2 years ago
- DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from da… ☆46 Updated last month
- Distributed persistent Task Queue running on Dask ☆38 Updated 2 years ago
- stream data generator ☆14 Updated last year
- ☆36 Updated last year
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆75 Updated last week
- A CLI for spinning up and managing Ray clusters for the Daft Query Engine. ☆13 Updated 4 months ago
- A lightweight UI for Lakekeeper ☆13 Updated this week