AbsaOSS/spline-spark-agent

Spline agent for Apache Spark

☆206

Alternatives and similar repositories for spline-spark-agent

Users interested in spline-spark-agent are comparing it to the repositories listed below.

AbsaOSS / spline
Data Lineage Tracking And Visualization Solution
☆662 · Jul 13, 2026 · Updated last week
RoundYuanYuan / spark-field-lineage
Spark field lineage
☆32 · Jun 7, 2022 · Updated 4 years ago
AbsaOSS / spline-getting-started
☆26 · Jun 24, 2026 · Updated 3 weeks ago
hortonworks-spark / spark-atlas-connector
A Spark Atlas connector to track data lineage in Apache Atlas
☆268 · Nov 16, 2022 · Updated 3 years ago
frankyu8 / ushas
This project is used for tracking lineage when using spark. Our team is aimed at enhancing the ability of column relation during logical …
☆20 · Jan 7, 2022 · Updated 4 years ago
apache / kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆2,353 · Updated this week
G-Research / spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
☆238 · Jul 1, 2026 · Updated 2 weeks ago
AbsaOSS / ABRiS
Avro SerDe for Apache Spark structured APIs.
☆242 · Jun 10, 2025 · Updated last year
cubefs / compass
Compass is a task diagnosis platform for bigdata
☆405 · Nov 23, 2024 · Updated last year
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated 3 weeks ago
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
fsk119 / flink-pageviews-demo
A simple demo about Flink Upsert-kafka
☆16 · Mar 11, 2021 · Updated 5 years ago
OpenLineage / OpenLineage
An Open Standard for lineage metadata collection
☆2,552 · Updated this week
RHobart / spark-lineage-parent
Tracks field-level lineage in Spark SQL
☆21 · Nov 11, 2024 · Updated last year
awslabs / aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a…
☆230 · May 18, 2026 · Updated 2 months ago
YotpoLtd / metorikku
A simplified, lightweight ETL Framework based on Apache Spark
☆588 · Jan 24, 2024 · Updated 2 years ago
apache / amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
☆1,149 · Updated this week
awslabs / deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
☆3,635 · Updated this week
linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week
LucaCanali / sparkMeasure
This repository contains the development code for sparkMeasure, an Apache Spark performance analysis and troubleshooting library. It simp…
☆827 · May 19, 2026 · Updated 2 months ago
apache / celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
☆1,056 · Updated this week
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
thesquelched / spark-lineage
Spark SQL listener to record lineage information
☆28 · Jan 24, 2021 · Updated 5 years ago
AbsaOSS / pramen
Resilient data pipeline framework running on Apache Spark
☆31 · Updated this week
HamaWhiteGG / flink-sql-lineage
The Lineage Analysis system for FlinkSQL supports advanced syntax such as Watermark, UDTF, CEP, Windowing TVFs, and CTAS.
☆413 · Nov 20, 2025 · Updated 8 months ago
TristanBilot / airflow-rbac-roles-cli
A tool to create Airflow RBAC roles with dag-level permissions from cli.
☆13 · Sep 7, 2023 · Updated 2 years ago
NetEase / spark-alarm
Alerting and monitoring tool for Apache Spark
☆23 · May 20, 2022 · Updated 4 years ago
melin / superior-sql-parser
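An ANTLR4-based SQL parser for multiple databases that extracts metadata from SQL and can be used in many data platform scenarios: extracting metadata from DDL statements, SQL permission checks, table-level lineage, SQL syntax validation, and more. Supports Spark, Flink, Gauss, StarRocks, Oracle, MySQL, Postgresq…
☆417 · Jun 22, 2026 · Updated 3 weeks ago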
yaooqinn / itachi
A library that brings useful functions from various modern database management systems to Apache Spark
☆63 · Sep 4, 2023 · Updated 2 years ago
AbsaOSS / cobrix
A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
☆167 · Jun 22, 2026 · Updated 3 weeks ago
xskipper-io / xskipper
An Extensible Data Skipping Framework
☆50 · Jul 15, 2025 · Updated last year
aistack / sql-booster
This is a library for SQL optimizing/rewriting including Materialized View rewrite
☆70 · Jun 21, 2022 · Updated 4 years ago
irajhedayati / savro
Some Avro operations in Scala
☆10 · Jun 29, 2026 · Updated 3 weeks ago
microsoft / hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
☆430 · Jan 14, 2022 · Updated 4 years ago
apache / ranger
Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
☆1,065 · Updated this week
s-wool / presto-csd
Presto for Cloudera Manager
☆15 · Apr 22, 2016 · Updated 10 years ago
leesf / hudi-demos
A collection of Apache Hudi demos to help beginners get started with Apache Hudi quickly
☆74 · Sep 13, 2020 · Updated 5 years ago
Azure / azure-schema-registry-for-kafka
Kafka support for Azure Schema Registry.
☆17 · Jun 6, 2025 · Updated last year
apache / hudi
Upserts, Deletes And Incremental Processing on Big Data.
☆6,193 · Updated this week