Library for organizing batch processing pipelines in Apache Spark
★42 · Jan 4, 2017 · Updated 9 years ago
Alternatives and similar repositories for spark-flow
Users that are interested in spark-flow are comparing it to the libraries listed below
- Scala library for converting Spark rows to case classes ★11 · Mar 14, 2017 · Updated 8 years ago
- 💻 CLI for reporting events to Faros platform ★14 · Jan 30, 2026 · Updated last month
- ★12 · Aug 22, 2018 · Updated 7 years ago
- Various data stream/batch process demo with Apache Scala Spark ★11 · Feb 28, 2020 · Updated 6 years ago
- A framework for creating composable and pluggable data processing pipelines using Apache Spark, and running them on a cluster. ★47 · Aug 1, 2016 · Updated 9 years ago
- Scala API for Apache Spark SQL high-order functions ★14 · Aug 4, 2023 · Updated 2 years ago
- Code example for a DDD workshop. Learning to build a CQRS / ES system with FSharp ★14 · Oct 19, 2016 · Updated 9 years ago
- low-level helpers for Apache Spark libraries and tests ★16 · Dec 29, 2018 · Updated 7 years ago
- Streaming Data Simulator ★17 · Oct 12, 2020 · Updated 5 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ★16 · Oct 14, 2019 · Updated 6 years ago
- Yet another collector for Mesos. Aiming to be as robust and platform (Mesos version) independent as possible. ★16 · Aug 13, 2015 · Updated 10 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ★73 · Mar 14, 2021 · Updated 4 years ago
- Building blocks and patterns for building data prep transformations and feature engineering in Spark. ★16 · Mar 16, 2016 · Updated 9 years ago
- MapReduce performance testing using teragen and terasort ★18 · Aug 26, 2021 · Updated 4 years ago
- Spark Extension : ML transformers, SQL aggregations, etc that are missing in Apache Spark ★146 · Jan 26, 2016 · Updated 10 years ago
- ScalaIO 2014 Workshop ★25 · Oct 23, 2014 · Updated 11 years ago
- Affinity Propagation on Spark ★20 · May 31, 2021 · Updated 4 years ago
- Sample processing code using Spark 2.1+ and Scala ★51 · Jun 28, 2020 · Updated 5 years ago
- Java Pattern Matching library ★21 · Nov 9, 2023 · Updated 2 years ago
- Utilities for writing tests that use Apache Spark. ★24 · Dec 29, 2018 · Updated 7 years ago
- Support for operating on images via Apache Spark ★26 · Jun 12, 2023 · Updated 2 years ago
- Repository for advanced unit-testing with embedded kafka services ★25 · Dec 3, 2018 · Updated 7 years ago
- TestRail Reporter for all popular JavaScript (JS) and TypeScript (TS) based testing frameworks, enabling easy submission of test results … ★16 · Updated this week
- Better JDBC wrapper for Scala ★24 · Sep 10, 2022 · Updated 3 years ago
- Big Data Toolkit for the JVM ★146 · Nov 4, 2020 · Updated 5 years ago
- A connector for SingleStore and Spark ★162 · Sep 24, 2025 · Updated 5 months ago
- An extension to the amazing Spark framework for better functional programming. ★28 · May 19, 2016 · Updated 9 years ago
- Minimal example code for integration testing of Apache Kafka. ★25 · Dec 10, 2017 · Updated 8 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ★26 · Jun 7, 2021 · Updated 4 years ago
- spark-sight: Spark performance at a glance ★10 · Apr 6, 2023 · Updated 2 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ★29 · Nov 4, 2024 · Updated last year
- Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration. ★70 · May 8, 2023 · Updated 2 years ago
- Source code for SIMD benchmarks and experiments in Java ★32 · Jun 30, 2017 · Updated 8 years ago
- ⚡️ Actions and Reducer Utilities for NGRX ★10 · Oct 17, 2019 · Updated 6 years ago
- ★14 · Nov 10, 2025 · Updated 3 months ago
- ★11 · Apr 28, 2023 · Updated 2 years ago
- Type-safe SQL builder for Scala ★30 · Jul 18, 2019 · Updated 6 years ago
- MLeap: Deploy ML Pipelines to Production ★1,536 · Jan 12, 2026 · Updated last month
- A collection of Apache Parquet add-on modules ★30 · Updated this week