Spark SQL listener to record lineage information
☆28Jan 24, 2021Updated 5 years ago
Alternatives and similar repositories for spark-lineage
Users who are interested in spark-lineage are comparing it to the libraries listed below.
- Spark field lineage☆32Jun 7, 2022Updated 4 years ago
- A Spark Atlas connector to track data lineage in Apache Atlas☆268Nov 16, 2022Updated 3 years ago
- Processing videos on Apache Spark☆13Feb 14, 2022Updated 4 years ago
- Track field-level lineage in Spark SQL☆21Nov 11, 2024Updated last year
- A HBase datasource implementation for Spark and [MLSQL](http://www.mlsql.tech).☆15Sep 29, 2023Updated 2 years ago
- HDFS rsync-like utility to replicate data between HDFS clusters☆17Jun 16, 2012Updated 14 years ago
- Already merged into apache/incubator-kyuubi: ACL Management for Apache Spark SQL with Apache Ranger.☆58Nov 11, 2021Updated 4 years ago
- ☆16Nov 8, 2015Updated 10 years ago
- REST job server for Apache Spark☆44May 23, 2025Updated last year
- Mirror of Apache Hive☆33Mar 16, 2020Updated 6 years ago
- spark connector for Milvus☆17Updated this week
- A library that brings useful functions from various modern database management systems to Apache Spark☆63Sep 4, 2023Updated 2 years ago
- simd enabled column imprints☆11Feb 12, 2018Updated 8 years ago
- Data Lineage Tracking And Visualization Solution☆660Jun 2, 2026Updated 2 weeks ago
- Java task scheduler that executes threads whose dependencies are managed by a directed acyclic graph☆26Feb 2, 2017Updated 9 years ago
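- A Django-based operations platform, including authentication and authorization, asset management, monitoring and alerting, workflow approval, and release deployment.☆11Aug 15, 2024Updated last year
- Queries the Spark REST API for applications, jobs, stages, executors, RDDs, streaming, environment, and other information to provide monitoring and alerting services☆11Nov 22, 2018Updated 7 years ago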
- Cloudera CDP SDK for Java☆17Jun 8, 2026Updated last week
- Scala Mison implementation☆15Nov 16, 2018Updated 7 years ago
- Ray Framework (https://github.com/ray-project/ray) on Kubernetes☆13Oct 12, 2018Updated 7 years ago
- ☆15Oct 12, 2021Updated 4 years ago
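- A study of the HBase database source code (including code comments, documentation, and test cases for code analysis)☆10May 18, 2017Updated 9 years ago
- Provides a Solr-to-Elasticsearch syntax translation engine that is compatible with existing Solr syntax, along with an annotation-based ORM implementation☆12Oct 8, 2015Updated 10 years ago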
- ☆11Jul 18, 2021Updated 4 years ago
- My Blog☆76May 3, 2018Updated 8 years ago
- Visualize column-level data lineage in Spark SQL☆92May 13, 2022Updated 4 years ago
- A Swift micro-framework for generating compact identifiers that are time ordered in distributed systems without the need for synchronizat…☆13Mar 10, 2018Updated 8 years ago
- Mock Http Server with zero dependencies☆22Jun 11, 2026Updated last week
- ☆48Dec 19, 2025Updated 5 months ago
- The code implementation for the article "Towards Patronizing and Condescending Language in Chinese Videos: A Multimodal Dataset and Fram…☆16Apr 3, 2025Updated last year
- An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC☆42Oct 1, 2024Updated last year
- Spline agent for Apache Spark☆205May 27, 2026Updated 3 weeks ago
- sql code autocomplete☆45Sep 2, 2020Updated 5 years ago
- A simplified, lightweight ETL Framework based on Apache Spark☆588Jan 24, 2024Updated 2 years ago
- Shuttle:High Available, High Performance Remote Shuffle Service☆156Mar 28, 2023Updated 3 years ago
- Thoughts on things I find interesting.☆17Dec 19, 2024Updated last year
- ☆20Feb 1, 2017Updated 9 years ago
- ☆26Jul 6, 2024Updated last year
- Source code for TPCx-BB benchmark for Hive and SparkSQL on scale factor of 300 GB☆10Jun 26, 2018Updated 7 years ago