fredrikhgrelland / data-mesh
A cloud native data mesh implementation
☆12 Updated 4 years ago
Alternatives and similar repositories for data-mesh
Users that are interested in data-mesh are comparing it to the libraries listed below
- Helpers & syntactic sugar for PySpark. ☆62 Updated 2 years ago
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆267 Updated 6 months ago
- Python binding for DataFusion ☆59 Updated 3 years ago
- Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful … ☆144 Updated last year
- A JupyterLab extension providing a SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino ☆90 Updated last week
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆126 Updated 4 years ago
- This repository is no longer maintained. ☆15 Updated 3 years ago
- Open-source metadata collector based on ODD Specification ☆44 Updated last year
- Read Delta tables without any Spark ☆47 Updated last year
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 7 years ago
- Fake Pandas / PySpark DataFrame creator ☆48 Updated last year
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆25 Updated 2 years ago
- Ibis analytics, with Ibis (and more!) ☆22 Updated last year
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆62 Updated 2 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆114 Updated last year
- ☆22 Updated last month
- Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for the purpose of analysis and machine learning. ☆65 Updated 5 years ago
- Apache DataLab (incubating) ☆152 Updated 2 years ago
- Data Tools Subjective List ☆86 Updated 2 years ago
- Data pipelines from re-usable components ☆107 Updated 2 years ago
- The open-source Useful SDK. One Python decorator in the Useful library allows for full observability of Python functions within an ETL. ☆19 Updated last year
- A library on top of either pex or conda-pack to make your Python code easily available on a cluster ☆45 Updated 2 weeks ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆30 Updated 2 weeks ago
- A parser for SQL, which gives back identifiers and a hierarchical model for lineage tracking ☆20 Updated 7 years ago
- ☆107 Updated 2 years ago
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆120 Updated 3 weeks ago
- A tool and library for easily deploying applications on Apache YARN ☆144 Updated last year
- ☆23 Updated last year
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or… ☆95 Updated 2 years ago
- Arrow, pydantic style ☆85 Updated 2 years ago