intuit/superglue

Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs and reports.

☆160

Alternatives and similar repositories for superglue

Users who are interested in superglue are comparing it to the libraries listed below.

springernature / fs2-pdf
Streaming PDF processor for Scala
☆13 · Apr 2, 2025 · Updated last year
databand-ai / dbnd
DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes.
☆267 · Mar 4, 2026 · Updated 4 months ago
AbsaOSS / spline
Data Lineage Tracking And Visualization Solution
☆663 · Updated this week
uber / marmaray
Generic Data Ingestion & Dispersal Library for Hadoop
☆483 · Mar 19, 2023 · Updated 3 years ago
slidoapp / dbt-superset-lineage
Make dbt docs and Apache Superset talk to one another
☆156 · Feb 12, 2026 · Updated 5 months ago
InfuseAI / piperider
Code review for data in dbt
☆495 · Jan 3, 2025 · Updated last year
MarquezProject / marquez
Collect, aggregate, and visualize a data ecosystem's metadata
☆2,248 · Updated this week
databrickslabs / dataframe-rules-engine
Extensible Rules Engine for custom Dataframe / Dataset validation
☆141 · May 7, 2024 · Updated 2 years ago
AbsaOSS / spark-hofs
Scala API for Apache Spark SQL high-order functions
☆15 · Aug 4, 2023 · Updated 2 years ago
fal-ai / dbt_feature_store
Build your feature store with macros right within your dbt repository
☆39 · Dec 16, 2022 · Updated 3 years ago
joerg-schneider / airtunnel
The sane way of building a data layer in Airflow
☆24 · Dec 5, 2019 · Updated 6 years ago
dimajix / flowman
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi…
☆97 · Updated this week
odpi / egeria
Egeria core
☆918 · Updated this week
OpenLineage / OpenLineage
An Open Standard for lineage metadata collection
☆2,562 · Updated this week
tokern / data-lineage
Generate and Visualize Data Lineage from query history
☆324 · Aug 4, 2023 · Updated 2 years ago
grouparoo / grouparoo
🦘 The Grouparoo Monorepo - open source customer data sync framework
☆778 · Apr 8, 2022 · Updated 4 years ago
tokern / lakecli
A CLI to manage and monitor permissions in AWS Lake Formation
☆25 · Feb 8, 2023 · Updated 3 years ago
awslabs / deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
☆3,637 · Updated this week
sodadata / soda-core
Data Contracts engine for the modern data stack. https://www.soda.io
☆2,397 · Updated this week
data-catering / data-contract-playground
Playground site for creating/validating data contracts
☆11 · Aug 9, 2025 · Updated 11 months ago
rsyi / whale
🐳 The stupidly simple CLI workspace for your data warehouse.
☆727 · Feb 8, 2023 · Updated 3 years ago
adidas / datamesh-sharing-data-at-scale
adidas Data Mesh implementation
☆12 · May 13, 2022 · Updated 4 years ago
louisguitton / dbt-metadata-utils
Parse dbt artifacts and search dbt models with Algolia
☆52 · May 6, 2021 · Updated 5 years ago
laserdisc-io / tamer
Standalone alternatives to Kafka Connect Connectors
☆46 · Mar 10, 2026 · Updated 4 months ago
anelendata / tap-rest-api
Singer.io tap for generic Rest API
☆25 · Jul 14, 2026 · Updated last week
dataform-co / dataform
Dataform is a framework for managing SQL based data operations in BigQuery
☆992 · Updated this week
fqaiser94 / mse
Make Structs Easy (MSE)
☆18 · Jun 22, 2020 · Updated 6 years ago
swoop-inc / spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
☆73 · Mar 14, 2021 · Updated 5 years ago
PacktPublishing / GCP-Complete-Google-Data-Engineer-and-Cloud-Architect-Guide-v-
Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt
☆16 · Jan 30, 2023 · Updated 3 years ago
re-data / dbt-re-data
re_data - fix data issues before your users & CEO would discover them 😊
☆102 · May 6, 2024 · Updated 2 years ago
PBWebMedia / airflow-prometheus-exporter
Export Airflow metrics (from mysql) in prometheus format
☆29 · Jun 12, 2026 · Updated last month
re-data / re-data
re_data - fix data issues before your users & CEO would discover them 😊
☆1,566 · Apr 30, 2024 · Updated 2 years ago
logicalclocks / hopsworks
Hopsworks - Data-Intensive AI platform with a Feature Store
☆1,301 · Feb 10, 2025 · Updated last year
etsy / boundary-layer
Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform
☆260 · Jul 19, 2023 · Updated 3 years ago
reata / sqllineage
SQL Lineage Analysis Tool powered by Python
☆1,674 · Updated this week
hotgluexyz / recipes
Simple samples for writing ETL transform scripts in Python
☆25 · Jan 20, 2026 · Updated 6 months ago
amundsen-io / amundsen
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting…
☆4,780 · Jul 1, 2026 · Updated 3 weeks ago
streamthoughts / kafka-connect-transform-grok
Grok Expression Transform for Kafka Connect.
☆16 · Jun 26, 2026 · Updated last month
microsoft / hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
☆430 · Jan 14, 2022 · Updated 4 years ago