Scala API for Apache Spark SQL high-order functions
☆14 · Aug 4, 2023 · Updated 2 years ago
Alternatives and similar repositories for spark-hofs
Users interested in spark-hofs are comparing it to the libraries listed below.
- Make Structs Easy (MSE) ☆18 · Jun 22, 2020 · Updated 5 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ☆29 · Nov 4, 2024 · Updated last year
- Extensible streaming ingestion pipeline on top of Apache Spark ☆46 · Jul 17, 2025 · Updated 8 months ago
- Dynamic Conformance Engine ☆32 · Oct 17, 2025 · Updated 5 months ago
- Resilient data pipeline framework running on Apache Spark ☆26 · Updated this week
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆73 · Mar 14, 2021 · Updated 5 years ago
- ☆10 · May 16, 2022 · Updated 3 years ago
- Open source task scheduler with dependency management ☆15 · Jul 1, 2018 · Updated 7 years ago
- Utilities for writing tests that use Apache Spark. ☆24 · Dec 29, 2018 · Updated 7 years ago
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌 ☆29 · May 15, 2020 · Updated 5 years ago
- Nested array transformation helper extensions for Apache Spark ☆37 · Aug 4, 2023 · Updated 2 years ago
- Example Porter bundles ☆14 · Oct 13, 2025 · Updated 5 months ago
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines ☆17 · Jan 21, 2020 · Updated 6 years ago
- Avro SerDe for Apache Spark structured APIs. ☆241 · Jun 10, 2025 · Updated 9 months ago
- Friendly, Scala like, Sequence interface ☆12 · Jan 13, 2026 · Updated 2 months ago
- A sample monorepo of several Python libraries and commands, using Bazel as build system ☆13 · Oct 11, 2017 · Updated 8 years ago
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark. ☆76 · Apr 24, 2024 · Updated last year
- Apache Spark ETL Utilities ☆39 · Oct 23, 2024 · Updated last year
- ☆17 · May 17, 2023 · Updated 2 years ago
- A script to automate and simplify simple system tasks, such as service control, package control, system monitoring, pinging etc. This scr… ☆10 · Nov 27, 2022 · Updated 3 years ago
- ☆14 · Feb 28, 2025 · Updated last year
- Command-line tool to find the nearest retail store ☆10 · Jan 18, 2017 · Updated 9 years ago
- Deploy microk8s on OpenStack with MetalLB ☆12 · Sep 28, 2022 · Updated 3 years ago
- An Ansible role to provision CentOS 7 LXC containers on Proxmox integrated with FreeIPA ☆12 · Oct 12, 2023 · Updated 2 years ago
- Library and a Framework for building fast, scalable, fault-tolerant Data APIs based on Akka, Avro, ZooKeeper and Kafka ☆24 · Oct 16, 2020 · Updated 5 years ago
- We Are Wizards Blog ☆19 · Oct 31, 2016 · Updated 9 years ago
- Executable script for pony voice synthesis project ☆11 · Jun 21, 2022 · Updated 3 years ago
- an open source dataworks platform ☆21 · Jun 4, 2021 · Updated 4 years ago
- Cortex.dev ML Serving Client for Python with garbage API collection. ☆15 · Apr 26, 2023 · Updated 2 years ago
- Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream … ☆22 · Feb 6, 2017 · Updated 9 years ago
- O'Reilly's Escalate with Scala 3 Material ☆13 · Jun 1, 2021 · Updated 4 years ago
- An open source enterprise data warehousing and analysis platform. ☆22 · Nov 8, 2021 · Updated 4 years ago
- i3 status line generator ☆12 · Apr 11, 2025 · Updated 11 months ago
- Get a Raspberry Pi up and running with a few commands ☆19 · Apr 22, 2018 · Updated 7 years ago
- Build configuration-driven ETL pipelines on Apache Spark ☆161 · Oct 4, 2022 · Updated 3 years ago
- Efficiently automate your release note generation with 'generate-release-notes'. This GH action scans your target GitHub repository's iss… ☆12 · Updated this week
- Apache Daffodil ☆110 · Updated this week
- ☆11 · Jun 29, 2018 · Updated 7 years ago
- R COBOL DI (Data Integration) Package: Import COBOL CopyBook data files directly into R as properly structured data frames. ☆15 · Aug 7, 2024 · Updated last year