hortonworks-spark/spark-atlas-connector


A Spark Atlas connector to track data lineage in Apache Atlas

☆268

Alternatives and similar repositories for spark-atlas-connector

Users who are interested in spark-atlas-connector are comparing it to the libraries listed below.

apache / atlas
Apache Atlas - Open Metadata Management and Governance capabilities across the Hadoop platform and beyond
☆2,120 · Updated this week
AbsaOSS / spline
Data Lineage Tracking And Visualization Solution
☆662 · Updated this week
thesquelched / spark-lineage
Spark SQL listener to record lineage information
☆28 · Jan 24, 2021 · Updated 5 years ago
AbsaOSS / spline-spark-agent
Spline agent for Apache Spark
☆206 · Updated this week
yaooqinn / spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is contributed to Apache Kyuubi | The project has been migrated to Apa…
☆183 · Apr 6, 2022 · Updated 4 years ago
apache / kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆2,351 · Updated this week
bernhard-42 / pyspark-atlas
PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection
☆17 · Jan 12, 2017 · Updated 9 years ago
yaooqinn / spark-ranger
Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.
☆59 · Nov 11, 2021 · Updated 4 years ago
allwefantasy / sql-code-intelligence
SQL code autocomplete
☆45 · Sep 2, 2020 · Updated 5 years ago
RoundYuanYuan / spark-field-lineage
Spark field lineage
☆32 · Jun 7, 2022 · Updated 4 years ago
qubole / sparklens
Qubole Sparklens tool for performance tuning Apache Spark
☆592 · Jun 26, 2024 · Updated 2 years ago
mantoudev / atlas_cn
Chinese version of the official Atlas documentation
☆69 · Jun 19, 2019 · Updated 7 years ago
hortonworks-spark / spark-llap
☆102 · Mar 23, 2020 · Updated 6 years ago
maropu / spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
☆34 · Sep 8, 2022 · Updated 3 years ago
apache / griffin
Mirror of Apache Griffin
☆1,169 · Aug 3, 2025 · Updated 11 months ago
maropu / spark-sql-flow-plugin
Visualize column-level data lineage in Spark SQL
☆92 · May 13, 2022 · Updated 4 years ago
mrpowers-io / spark-daria
Essential Spark extensions and helper methods ✨😲
☆767 · Jun 22, 2026 · Updated 3 weeks ago
oap-project / sql-ds-cache
Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.
☆37 · Jan 3, 2023 · Updated 3 years ago
linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
☆1,370 · Aug 22, 2023 · Updated 2 years ago
DTStack / flinkStreamSQL
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
☆2,051 · Feb 21, 2024 · Updated 2 years ago
apache / ranger
Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
☆1,062 · Updated this week
HamaWhiteGG / flink-sql-lineage
The Lineage Analysis system for FlinkSQL supports advanced syntax such as Watermark, UDTF, CEP, Windowing TVFs, and CTAS.
☆413 · Nov 20, 2025 · Updated 7 months ago
apache / submarine
Submarine is a Cloud Native Machine Learning Platform.
☆706 · Apr 3, 2024 · Updated 2 years ago
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated 2 weeks ago
apache / livy
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
☆959 · Updated this week
apache / linkis
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications…
☆3,407 · Updated this week
YotpoLtd / metorikku
A simplified, lightweight ETL Framework based on Apache Spark
☆588 · Jan 24, 2024 · Updated 2 years ago
falarica / presto-gateway
Presto Gateway routes queries based on policy.
☆12 · Sep 15, 2020 · Updated 5 years ago
apache / amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
☆1,148 · Updated this week
byzer-org / byzer-lang
Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.
☆1,835 · May 29, 2024 · Updated 2 years ago
paypal / NNAnalytics
NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.
☆121 · Nov 25, 2025 · Updated 7 months ago
apache / hbase-connectors
Apache HBase Connectors
☆244 · Updated this week
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 2 weeks ago
apache / uniffle
Uniffle is a high performance, general purpose Remote Shuffle Service.
☆450 · Updated this week
delta-io / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…
☆8,905 · Updated this week
LucaCanali / sparkMeasure
This repository contains the development code for sparkMeasure, an Apache Spark performance analysis and troubleshooting library. It simp…
☆827 · May 19, 2026 · Updated last month
scxwhite / parseX
SQL parsing tool. Mainly parses Hive SQL, Spark SQL, and Presto SQL, extracting input tables, output tables, fields, and other information from the SQL
☆98 · Jun 14, 2023 · Updated 3 years ago
smart-data-lake / smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
☆129 · Updated this week
WeBankFinTech / Scriptis
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, res…
☆813 · Dec 11, 2024 · Updated last year