intuit / superglue
Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs and reports.
★157 Updated 2 years ago
Alternatives and similar repositories for superglue:
Users who are interested in superglue are comparing it to the libraries listed below:
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ★94 Updated last week
- A simple Spark-powered ETL framework that just works ★181 Updated last month
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ★75 Updated 3 years ago
- A tool to validate data, built around Apache Spark. ★101 Updated last month
- Data Tools Subjective List ★83 Updated last year
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ★61 Updated 2 years ago
- Data ingestion library for Amundsen to build graph and search index ★205 Updated last year
- ★80 Updated last week
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ★26 Updated 3 years ago
- Schema modelling framework for decentralised domain-driven ownership of data. ★252 Updated last year
- Metadata service library for Amundsen ★83 Updated last month
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ★41 Updated last week
- Snowflake Data Source for Apache Spark. ★225 Updated 3 weeks ago
- Use SQL to build ELT pipelines on a data lakehouse. ★286 Updated 2 years ago
- Generate and Visualize Data Lineage from query history ★324 Updated last year
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ★61 Updated 5 months ago
- ETLy is an add-on dashboard service on top of Apache Airflow. ★68 Updated last year
- [ARCHIVED] The Presto adapter plugin for dbt Core ★33 Updated last year
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or… ★92 Updated 2 years ago
- The metrics layer for your data. Join us at https://metriql.com/slack ★304 Updated 2 years ago
- Sample configuration to deploy a modern data platform. ★88 Updated 3 years ago
- A Table format agnostic data sharing framework ★38 Updated last year
- dbt's adapter for dremio ★48 Updated 2 years ago
- Adapter for dbt that executes dbt pipelines on Apache Flink ★95 Updated last year
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ★73 Updated last year
- Yet Another (Spark) ETL Framework ★21 Updated last year
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ★29 Updated this week
- Amundsen Gremlin ★21 Updated 2 years ago
- The Internals of Spark on Kubernetes ★71 Updated 2 years ago
- re_data - fix data issues before your users & CEO would discover them ★98 Updated 11 months ago