fredrikhgrelland / data-mesh
A cloud-native data mesh implementation
☆12 Updated 4 years ago
Alternatives and similar repositories for data-mesh
Users who are interested in data-mesh are comparing it to the libraries listed below
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆120 Updated last month
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or… ☆94 Updated 3 years ago
- Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful … ☆145 Updated last year
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 7 years ago
- Data pipelines from re-usable components ☆107 Updated 2 years ago
- A JupyterLab extension providing a SQL formatter, auto-completion, syntax highlighting, and Spark SQL and Trino support ☆91 Updated last week
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆267 Updated 7 months ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆125 Updated 4 years ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆159 Updated 2 years ago
- Python binding for DataFusion ☆59 Updated 3 years ago
- This repository is no longer maintained. ☆15 Updated 3 years ago
- Deploy Dask on YARN clusters ☆69 Updated last year
- Data Catalog for Databases and Data Warehouses ☆35 Updated last year
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆25 Updated 2 years ago
- Apache DataLab (incubating) ☆152 Updated 2 years ago
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆79 Updated last year
- Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for analysis and machine learning. ☆65 Updated 5 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆30 Updated 2 weeks ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆62 Updated 2 years ago
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 Updated last year
- Fake Pandas / PySpark DataFrame creator ☆48 Updated last year
- Build your feature store with macros right within your dbt repository ☆39 Updated 2 years ago
- 🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 Updated 2 years ago
- Tool to automate data quality checks on data pipelines ☆253 Updated 3 years ago
- Helpers & syntactic sugar for PySpark. ☆62 Updated 2 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆96 Updated last month
- A library that brings useful functions from various modern database management systems to Apache Spark ☆60 Updated 2 years ago
- Python automatic data quality check toolkit ☆282 Updated 5 years ago
- Data Lineage Tracing Library ☆23 Updated 3 years ago
- Open-source metadata collector based on ODD Specification ☆44 Updated last year