fredrikhgrelland / data-mesh
A cloud native data mesh implementation
☆12 Updated 3 years ago
Related projects
Alternatives and complementary repositories for data-mesh
- Unified Distributed Execution ☆51 Updated last month
- ☆22 Updated 2 years ago
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆22 Updated last year
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 6 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆56 Updated last year
- A parser for SQL, which gives back identifiers and a hierarchical model for lineage tracking ☆20 Updated 6 years ago
- FlexMatcher is a schema matching package in Python which handles the problem of matching multiple schemas to a single mediated schema. ☆31 Updated last week
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆36 Updated 3 years ago
- Python - Java/Scala API for the Hopsworks feature store ☆53 Updated this week
- This repository is no longer maintained. ☆15 Updated 2 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 Updated 3 years ago
- ☆29 Updated 11 months ago
- Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics. ☆65 Updated 3 years ago
- An extension for Jupyter Lab & Jupyter Notebook to monitor Apache Spark (pyspark) from notebooks ☆46 Updated 9 months ago
- A collection of python utility functions ☆12 Updated 4 months ago
- Dask integration for Snowflake ☆30 Updated last week
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆17 Updated 2 years ago
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆53 Updated 2 months ago
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or… ☆92 Updated 2 years ago
- Instant access to the Spark cluster from anywhere ☆16 Updated 4 years ago
- Data Lineage Tracing Library ☆22 Updated 2 years ago
- A library on top of either pex or conda-pack to make your Python code easily available on a cluster ☆45 Updated this week
- Data pipelines from re-usable components ☆106 Updated last year
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆17 Updated 3 years ago
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆28 Updated 2 years ago
- Apache Arrow Flight example ☆11 Updated 4 years ago
- Data Catalog for Databases and Data Warehouses ☆31 Updated 10 months ago
- Ibis analytics, with Ibis (and more!) ☆19 Updated last month
- Fake Pandas / PySpark DataFrame creator ☆42 Updated 8 months ago