Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs and reports.
☆160 · Dec 10, 2022 · Updated 3 years ago
Alternatives and similar repositories for superglue
Users that are interested in superglue are comparing it to the libraries listed below.
- Streaming PDF processor for Scala ☆13 · Apr 2, 2025 · Updated last year
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆267 · Mar 4, 2026 · Updated 2 months ago
- A Go library that provides unification of identical operations (e.g. API requests). ☆18 · Mar 27, 2023 · Updated 3 years ago
- Foremast-brain is a component of Foremast project. ☆17 · Feb 3, 2023 · Updated 3 years ago
- Data Lineage Tracking And Visualization Solution ☆657 · Apr 23, 2026 · Updated last week
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆97 · Apr 15, 2026 · Updated 3 weeks ago
- Generic Data Ingestion & Dispersal Library for Hadoop ☆481 · Mar 19, 2023 · Updated 3 years ago
- Collaboration app for sharing and reviewing jupyter notebooks ☆16 · May 25, 2025 · Updated 11 months ago
- Make dbt docs and Apache Superset talk to one another ☆157 · Feb 12, 2026 · Updated 2 months ago
- Collect, aggregate, and visualize a data ecosystem's metadata ☆2,177 · Apr 28, 2026 · Updated last week
- Code review for data in dbt ☆494 · Jan 3, 2025 · Updated last year
- Egeria core ☆907 · Updated this week
- The sane way of building a data layer in Airflow ☆24 · Dec 5, 2019 · Updated 6 years ago
- Scala API for Apache Spark SQL high-order functions ☆14 · Aug 4, 2023 · Updated 2 years ago
- An Open Standard for lineage metadata collection ☆2,430 · Apr 29, 2026 · Updated last week
- 🐳 The stupidly simple CLI workspace for your data warehouse. ☆728 · Feb 8, 2023 · Updated 3 years ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,614 · Updated this week
- A CLI to manage and monitor permissions in AWS Lake Formation ☆25 · Feb 8, 2023 · Updated 3 years ago
- Simple samples for writing ETL transform scripts in Python ☆25 · Jan 20, 2026 · Updated 3 months ago
- 🦘 The Grouparoo Monorepo - open source customer data sync framework ☆772 · Apr 8, 2022 · Updated 4 years ago
- Playground site for creating/validating data contracts ☆11 · Aug 9, 2025 · Updated 8 months ago
- adidas Data Mesh implementation ☆12 · May 13, 2022 · Updated 3 years ago
- Singer.io tap for generic Rest API ☆25 · Mar 30, 2026 · Updated last month
- Export Airflow metrics (from mysql) in prometheus format ☆29 · Apr 15, 2025 · Updated last year
- Parse dbt artifacts and search dbt models with Algolia ☆52 · May 6, 2021 · Updated 5 years ago
- Hopsworks - Data-Intensive AI platform with a Feature Store ☆1,294 · Feb 10, 2025 · Updated last year
- Dataform is a framework for managing SQL based data operations in BigQuery ☆973 · Updated this week
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆73 · Mar 14, 2021 · Updated 5 years ago
- Standalone alternatives to Kafka Connect Connectors ☆46 · Mar 10, 2026 · Updated last month
- Make Structs Easy (MSE) ☆18 · Jun 22, 2020 · Updated 5 years ago
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 · Jan 30, 2023 · Updated 3 years ago
- re_data - fix data issues before your users & CEO would discover them 😊 ☆101 · May 6, 2024 · Updated 2 years ago
- A frictionless integrated platform for notebook ☆82 · Jan 5, 2023 · Updated 3 years ago
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,762 · Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,453 · Updated this week
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads. ☆430 · Jan 14, 2022 · Updated 4 years ago
- An Apache Mesos Framework that allows for replaying load over and over and over (and over) again ☆10 · Aug 10, 2015 · Updated 10 years ago
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,569 · Apr 30, 2024 · Updated 2 years ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆127 · Aug 3, 2021 · Updated 4 years ago