dwarszawski / amundsen-atlas-types
Atlas custom type definitions
☆16Updated 3 years ago
Related projects
Alternatives and complementary repositories for amundsen-atlas-types
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.☆75Updated 7 months ago
- Amundsen library to place common code for Amundsen microservices to share☆9Updated 3 years ago
- A simple Spark-powered ETL framework that just works 🍺☆178Updated 11 months ago
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…☆85Updated 7 months ago
- DataQuality for BigData☆142Updated 11 months ago
- Data validation library for PySpark 3.0.0☆34Updated 2 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive☆187Updated last year
- Spark on Kubernetes infrastructure Helm charts repo☆199Updated 2 years ago
- ☆12Updated 2 years ago
- ☆151Updated this week
- ☆63Updated 5 years ago
- Spark-Radiant is Apache Spark Performance and Cost Optimizer☆25Updated 2 years ago
- A library that provides useful extensions to Apache Spark and PySpark.☆197Updated 2 weeks ago
- ☆77Updated last year
- Data ingestion library for Amundsen to build graph and search index☆206Updated 8 months ago
- Repository of helm charts for deploying DataHub on a Kubernetes cluster☆165Updated last week
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi…☆92Updated last month
- A library that brings useful functions from various modern database management systems to Apache Spark☆56Updated last year
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.☆86Updated 8 months ago
- Storage connector for Trino☆97Updated this week
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them☆135Updated last year
- Setup for running Trino with Hive Metastore on Kubernetes☆98Updated 2 years ago
- The Internals of Spark on Kubernetes☆70Updated 2 years ago
- The Internals of Delta Lake☆183Updated last month
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A…☆112Updated this week
- Adapter for dbt that executes dbt pipelines on Apache Flink☆85Updated 8 months ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an…☆60Updated 2 months ago
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆61Updated this week
- Kafka Connector for Iceberg tables☆16Updated last year