fredrikhgrelland / data-mesh
A cloud native data mesh implementation
☆12 Updated 4 years ago
Alternatives and similar repositories for data-mesh
Users that are interested in data-mesh are comparing it to the libraries listed below
- A JupyterLab extension providing a SQL formatter, auto-completion, and syntax highlighting for Spark SQL and Trino ☆90 Updated 2 months ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 7 years ago
- Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful … ☆144 Updated last year
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆158 Updated 2 years ago
- Apache DataLab (incubating) ☆152 Updated last year
- Build your feature store with macros right within your dbt repository ☆39 Updated 2 years ago
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆266 Updated 4 months ago
- Helpers & syntactic sugar for PySpark. ☆62 Updated 2 years ago
- Python - Java/Scala API for the Hopsworks feature store ☆54 Updated this week
- Data pipelines from re-usable components ☆107 Updated 2 years ago
- This repository is no longer maintained. ☆15 Updated 3 years ago
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆120 Updated last month
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆24 Updated last year
- Read Delta tables without any Spark ☆47 Updated last year
- General Metadata Architecture ☆127 Updated last week
- Materials for Apache Arrow workshop at VLDB 2019 ☆42 Updated 5 years ago
- 🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 Updated 2 years ago
- ☆70 Updated 7 months ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆37 Updated 4 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆29 Updated 2 weeks ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆126 Updated 4 years ago
- Data Lineage Tracing Library ☆23 Updated 3 years ago
- Open-source metadata collector based on ODD Specification ☆44 Updated last year
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆79 Updated 10 months ago
- Dask integration for Snowflake ☆30 Updated this week
- Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for the purpose of analysis and machine learning. ☆65 Updated 5 years ago
- Data Catalog for Databases and Data Warehouses ☆35 Updated last year
- Data Tools Subjective List ☆86 Updated last year
- Python binding for DataFusion ☆59 Updated 3 years ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆75 Updated last week