SETL-Framework / setl
A simple Spark-powered ETL framework that just works 🍺
☆182 · Updated last month
Alternatives and similar repositories for setl
Users who are interested in setl are comparing it to the libraries listed below.
- Smart Automation Tool for building modern Data Lakes and Data Pipelines ☆124 · Updated last week
- The Internals of Delta Lake ☆186 · Updated 8 months ago
- A simplified, lightweight ETL Framework based on Apache Spark ☆589 · Updated last year
- A library that provides useful extensions to Apache Spark and PySpark. ☆229 · Updated last month
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive ☆184 · Updated 2 years ago
- Snowflake Data Source for Apache Spark. ☆229 · Updated 2 weeks ago
- Flowchart for debugging Spark applications ☆107 · Updated 11 months ago
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in… ☆91 · Updated 4 months ago
- Spline agent for Apache Spark ☆197 · Updated last week
- A library that brings useful functions from various modern database management systems to Apache Spark ☆60 · Updated 2 years ago
- Spark-Radiant is Apache Spark Performance and Cost Optimizer ☆25 · Updated 8 months ago
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark. ☆76 · Updated last year
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. ☆344 · Updated last year
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆96 · Updated last week
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A… ☆127 · Updated 2 weeks ago
- Avro SerDe for Apache Spark structured APIs. ☆235 · Updated 3 months ago
- Data Lineage Tracking And Visualization Solution ☆638 · Updated this week
- ☆63 · Updated 5 years ago
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary! ☆233 · Updated 7 months ago
- ACID Data Source for Apache Spark based on Hive ACID ☆97 · Updated 4 years ago
- A tool to validate data, built around Apache Spark. ☆100 · Updated this week
- Qubole Sparklens tool for performance tuning Apache Spark ☆583 · Updated last year
- DataQuality for BigData ☆144 · Updated last year
- Spark style guide ☆262 · Updated 11 months ago
- Code snippets used in demos recorded for the blog. ☆38 · Updated 3 weeks ago
- The Internals of Spark on Kubernetes ☆71 · Updated 3 years ago
- Custom state store providers for Apache Spark ☆92 · Updated 7 months ago
- Examples of Spark 3.0 ☆46 · Updated 4 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 · Updated 5 years ago
- The Internals of Spark SQL ☆474 · Updated this week