YotpoLtd / metorikku
A simplified, lightweight ETL Framework based on Apache Spark
☆586 · Updated last year
Alternatives and similar repositories for metorikku
Users who are interested in metorikku are comparing it to the libraries listed below.
- A simple Spark-powered ETL framework that just works ☆181 · Updated last month
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa… ☆764 · Updated 2 weeks ago
- Qubole Sparklens tool for performance tuning Apache Spark ☆579 · Updated 11 months ago
- The Internals of Spark SQL ☆467 · Updated 5 months ago
- Data Lineage Tracking And Visualization Solution ☆630 · Updated last week
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive ☆185 · Updated 2 years ago
- Spline agent for Apache Spark ☆193 · Updated last week
- Essential Spark extensions and helper methods ✨😲 ☆760 · Updated 7 months ago
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) ☆447 · Updated last week
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. ☆344 · Updated last year
- Avro SerDe for Apache Spark structured APIs. ☆236 · Updated last week
- Build configuration-driven ETL pipelines on Apache Spark ☆159 · Updated 2 years ago
- The Internals of Delta Lake ☆184 · Updated 5 months ago
- ☆310 · Updated 6 years ago
- A library that provides useful extensions to Apache Spark and PySpark. ☆225 · Updated 3 months ago
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A… ☆125 · Updated 3 weeks ago
- The Internals of Spark Structured Streaming ☆419 · Updated 2 years ago
- Spark style guide ☆259 · Updated 8 months ago
- Snowflake Data Source for Apache Spark. ☆226 · Updated 2 weeks ago
- A load balancer / proxy / gateway for prestodb ☆358 · Updated 10 months ago
- A Spark plugin for reading and writing Excel files ☆498 · Updated this week
- DataQuality for BigData ☆144 · Updated last year
- A tool for monitoring and tuning Spark jobs for efficiency. ☆358 · Updated 2 years ago
- Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments. ☆284 · Updated this week
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap… ☆299 · Updated last year
- Spark package for checking data quality ☆221 · Updated 5 years ago
- The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a… ☆222 · Updated 3 months ago
- Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere. ☆913 · Updated last week
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them ☆135 · Updated last year
- ACID Data Source for Apache Spark based on Hive ACID ☆97 · Updated 3 years ago