MarquezProject / marquez
Collect, aggregate, and visualize a data ecosystem's metadata
☆1,781 · Updated last week
Related projects
Alternatives and complementary repositories for marquez
- An Open Standard for lineage metadata collection ☆1,772 · Updated this week
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,440 · Updated last week
- Egeria core ☆809 · Updated this week
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆1,913 · Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,040 · Updated this week
- ☆1,610 · Updated last week
- An open protocol for secure data sharing ☆770 · Updated last week
- Apache Atlas ☆1,838 · Updated this week
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg ☆1,158 · Updated this week
- Dremio - the missing link in modern data ☆1,381 · Updated 3 weeks ago
- Generate and Visualize Data Lineage from query history ☆311 · Updated last year
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,552 · Updated 6 months ago
- First open-source data discovery and observability platform. We make a life for data practitioners easy so you can focus on your business… ☆1,215 · Updated last month
- SQL Lineage Analysis Tool powered by Python ☆1,341 · Updated 2 months ago
- Efficient data transformation and modeling framework that is backwards compatible with dbt. ☆1,824 · Updated this week
- Hop Orchestration Platform ☆984 · Updated this week
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,309 · Updated last month
- 📙 Awesome Data Catalogs and Observability Platforms. ☆727 · Updated 3 months ago
- Data Lineage Tracking And Visualization Solution ☆604 · Updated this week
- Dynamically generate Apache Airflow DAGs from YAML configuration files ☆1,209 · Updated this week
- Python API for Deequ ☆730 · Updated last month
- Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages. ☆795 · Updated this week
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆405 · Updated last week
- Port(ish) of Great Expectations to dbt test macros ☆1,083 · Updated 2 months ago
- A simplified, lightweight ETL Framework based on Apache Spark ☆584 · Updated 9 months ago
- Apache PyIceberg ☆473 · Updated this week
- MetricFlow allows you to define, build, and maintain metrics in code. ☆1,146 · Updated this week
- Open, Multi-modal Catalog for Data & AI ☆2,432 · Updated this week
- New Generation Opensource Data Stack Demo ☆410 · Updated last year
- Guides and docs to help you get up and running with Apache Airflow. ☆800 · Updated 2 years ago