intuit / superglue
Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs and reports.
☆156 · Updated 2 years ago
Alternatives and similar repositories for superglue:
Users who are interested in superglue are comparing it to the libraries listed below.
- A simple Spark-powered ETL framework that just works ☆181 · Updated this week
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆94 · Updated 3 weeks ago
- Generate and Visualize Data Lineage from query history ☆322 · Updated last year
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 · Updated last year
- A library that provides useful extensions to Apache Spark and PySpark. ☆221 · Updated last week
- Data ingestion library for Amundsen to build graph and search index ☆205 · Updated last year
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆73 · Updated 3 years ago
- A tool to validate data, built around Apache Spark. ☆101 · Updated this week
- Schema modelling framework for decentralised domain-driven ownership of data. ☆251 · Updated last year
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them ☆135 · Updated last year
- Rules based grant management for Snowflake ☆40 · Updated 6 years ago
- Snowflake Data Source for Apache Spark. ☆222 · Updated 4 months ago
- Airflow support for Marquez ☆32 · Updated 4 years ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆97 · Updated 2 years ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆61 · Updated 3 months ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 · Updated 2 years ago
- Metadata service library for Amundsen ☆83 · Updated last week
- Data Tools Subjective List ☆83 · Updated last year
- A library that brings useful functions from various modern database management systems to Apache Spark ☆58 · Updated last year
- Great Expectations Airflow operator ☆161 · Updated last week
- DataQuality for BigData ☆144 · Updated last year
- re_data - fix data issues before your users & CEO would discover them ☆98 · Updated 10 months ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 · Updated 2 years ago
- ☆79 · Updated last year
- ☆63 · Updated 5 years ago
- Pylint plugin for static code analysis on Airflow code ☆93 · Updated 4 years ago
- Yet Another (Spark) ETL Framework ☆20 · Updated last year
- Apache DataLab (incubating) ☆153 · Updated last year
- A repository of sample code to accompany our blog post on Airflow and dbt. ☆171 · Updated last year
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ☆73 · Updated last year