SETL-Framework / setl
A simple Spark-powered ETL framework that just works 🍺
182 stars, updated last week
Alternatives and similar repositories for setl
Users who are interested in setl are comparing it to the libraries listed below.
- A library that provides useful extensions to Apache Spark and PySpark. (228 stars, updated last week)
- Smart Automation Tool for building modern Data Lakes and Data Pipelines (124 stars, updated this week)
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive (186 stars, updated 2 years ago)
- The Internals of Delta Lake (184 stars, updated 6 months ago)
- A simplified, lightweight ETL Framework based on Apache Spark (589 stars, updated last year)
- Flowchart for debugging Spark applications (106 stars, updated 10 months ago)
- Snowflake Data Source for Apache Spark. (226 stars, updated last month)
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in… (91 stars, updated 2 months ago)
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… (96 stars, updated 2 weeks ago)
- Spline agent for Apache Spark (196 stars, updated 2 weeks ago)
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A… (126 stars, updated last week)
- Code snippets used in demos recorded for the blog. (37 stars, updated last week)
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. (344 stars, updated last year)
- (63 stars, updated 5 years ago)
- DataQuality for BigData (144 stars, updated last year)
- Spark-Radiant is Apache Spark Performance and Cost Optimizer (25 stars, updated 7 months ago)
- A library that brings useful functions from various modern database management systems to Apache Spark (60 stars, updated last year)
- A tool to validate data, built around Apache Spark. (101 stars, updated this week)
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark. (76 stars, updated last year)
- Magic to help Spark pipelines upgrade (34 stars, updated 10 months ago)
- ACID Data Source for Apache Spark based on Hive ACID (97 stars, updated 4 years ago)
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary! (231 stars, updated 6 months ago)
- Examples of Spark 3.0 (47 stars, updated 4 years ago)
- Spark style guide (260 stars, updated 10 months ago)
- The Internals of Spark on Kubernetes (71 stars, updated 3 years ago)
- Data Lineage Tracking And Visualization Solution (638 stars, updated this week)
- Avro SerDe for Apache Spark structured APIs. (235 stars, updated last month)
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. (73 stars, updated 4 years ago)
- Custom state store providers for Apache Spark (92 stars, updated 5 months ago)
- Sample processing code using Spark 2.1+ and Scala (51 stars, updated 5 years ago)