Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs and reports.
☆161 · Dec 10, 2022 · Updated 3 years ago
Alternatives and similar repositories for superglue
Users that are interested in superglue are comparing it to the libraries listed below.
- Streaming PDF processor for Scala ☆13 · Apr 2, 2025 · Updated 11 months ago
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆268 · Mar 4, 2026 · Updated 3 weeks ago
- A Go library that provides unification of identical operations (e.g. API requests). ☆18 · Mar 27, 2023 · Updated 2 years ago
- Data Lineage Tracking And Visualization Solution ☆656 · Updated this week
- Generic Data Ingestion & Dispersal Library for Hadoop ☆482 · Mar 19, 2023 · Updated 3 years ago
- Collaboration app for sharing and reviewing jupyter notebooks ☆16 · May 25, 2025 · Updated 10 months ago
- Make dbt docs and Apache Superset talk to one another ☆157 · Feb 12, 2026 · Updated last month
- Collect, aggregate, and visualize a data ecosystem's metadata ☆2,149 · Updated this week
- Code review for data in dbt ☆495 · Jan 3, 2025 · Updated last year
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆98 · Updated this week
- Egeria core ☆899 · Updated this week
- The sane way of building a data layer in Airflow ☆24 · Dec 5, 2019 · Updated 6 years ago
- Build your feature store with macros right within your dbt repository ☆39 · Dec 16, 2022 · Updated 3 years ago
- A data lineage tool detects table dependencies from rendered SQL statements. ☆30 · Mar 14, 2026 · Updated last week
- Scala API for Apache Spark SQL high-order functions ☆14 · Aug 4, 2023 · Updated 2 years ago
- An Open Standard for lineage metadata collection ☆2,362 · Updated this week
- Generate and Visualize Data Lineage from query history ☆327 · Aug 4, 2023 · Updated 2 years ago
- 🐳 The stupidly simple CLI workspace for your data warehouse. ☆728 · Feb 8, 2023 · Updated 3 years ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,596 · Updated this week
- A CLI to manage and monitor permissions in AWS Lake Formation ☆25 · Feb 8, 2023 · Updated 3 years ago
- Data Contracts engine for the modern data stack. https://www.soda.io ☆2,311 · Updated this week
- A Java library to determine probability of objects being similar. ☆262 · Dec 12, 2025 · Updated 3 months ago
- Simple samples for writing ETL transform scripts in Python ☆25 · Jan 20, 2026 · Updated 2 months ago
- 🦘 The Grouparoo Monorepo - open source customer data sync framework ☆774 · Apr 8, 2022 · Updated 3 years ago
- Playground site for creating/validating data contracts ☆11 · Aug 9, 2025 · Updated 7 months ago
- adidas Data Mesh implementation ☆12 · May 13, 2022 · Updated 3 years ago
- Singer.io tap for generic Rest API ☆24 · Mar 3, 2026 · Updated 3 weeks ago
- Export Airflow metrics (from mysql) in prometheus format ☆29 · Apr 15, 2025 · Updated 11 months ago
- Parse dbt artifacts and search dbt models with Algolia ☆52 · May 6, 2021 · Updated 4 years ago
- Hopsworks - Data-Intensive AI platform with a Feature Store ☆1,289 · Feb 10, 2025 · Updated last year
- Dataform is a framework for managing SQL based data operations in BigQuery ☆967 · Mar 17, 2026 · Updated last week
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆73 · Mar 14, 2021 · Updated 5 years ago
- Standalone alternatives to Kafka Connect Connectors ☆46 · Mar 10, 2026 · Updated 2 weeks ago
- Make Structs Easy (MSE) ☆18 · Jun 22, 2020 · Updated 5 years ago
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 · Jan 30, 2023 · Updated 3 years ago
- re_data - fix data issues before your users & CEO would discover them 😊 ☆101 · May 6, 2024 · Updated last year
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,751 · Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,439 · Updated this week
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads. ☆431 · Jan 14, 2022 · Updated 4 years ago