indix/sparkplug

Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌

☆28

Alternatives and similar repositories for sparkplug

Users who are interested in sparkplug are comparing it to the libraries listed below.

indix / schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
☆116 • Mar 5, 2020 • Updated 6 years ago
amient / affinity
Library and a Framework for building fast, scalable, fault-tolerant Data APIs based on Akka, Avro, ZooKeeper and Kafka
☆25 • Oct 16, 2020 • Updated 5 years ago
yeghishe / ammonite-modules
☆14 • Jul 26, 2019 • Updated 6 years ago
indix / vasuki
Scale GoCD Agents on demand with Docker
☆13 • Apr 15, 2018 • Updated 8 years ago
scalanlp / junto
This toolkit provides an implementation of Modified Adsorption (MAD), a graph-based semi-supervised learning (SSL) algorithm.
☆24 • Jun 20, 2017 • Updated 9 years ago
AbsaOSS / spark-hofs
Scala API for Apache Spark SQL high-order functions
☆15 • Aug 4, 2023 • Updated 2 years ago
KoddiDev / geocoder
Google Maps geocoding library for Scala
☆12 • Oct 12, 2019 • Updated 6 years ago
AbsaOSS / atum
A dynamic data completeness and accuracy library at enterprise scale for Apache Spark
☆30 • May 13, 2026 • Updated 2 months ago
RedisLabs / ReSearch
Redis search and indexing in Java
☆16 • Sep 26, 2016 • Updated 9 years ago
jeoffreylim / maelstrom
Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream …
☆21 • Feb 6, 2017 • Updated 9 years ago
bluecolor / octopus
Open source task scheduler with dependency management
☆15 • Jul 1, 2018 • Updated 8 years ago
spotify / hornet
☆10 • Nov 15, 2016 • Updated 9 years ago
razie / diesel
Scala, DSL, Rules based reactive workflows and Microservices
☆14 • Oct 20, 2025 • Updated 9 months ago
indix / rocks
RocksDB Ops CLI
☆11 • Dec 17, 2016 • Updated 9 years ago
Karasiq / mapdbutils
Scala wrappers for MapDB
☆12 • Sep 9, 2017 • Updated 8 years ago
massung / scala-js-vue
ScalaJS Facade for Vue.js
☆16 • Sep 10, 2017 • Updated 8 years ago
allenai / pipeline
Library for building reproducible data pipelines to support experimentation
☆20 • Dec 16, 2015 • Updated 10 years ago
joshlemer / MultiIndex
A Scala Collection for Multiple Access Patterns
☆12 • Oct 22, 2016 • Updated 9 years ago
martincooper / scala-datatable
Immutable DataTable implementation in Scala
☆70 • Dec 30, 2019 • Updated 6 years ago
Salamahin / joinwiz
Make your joins typesafe again
☆27 • Feb 5, 2026 • Updated 5 months ago
kerzok / ScalaBot
☆13 • Nov 20, 2016 • Updated 9 years ago
indix / matsya
Place ASGs on the right Spot Market
☆40 • Dec 27, 2016 • Updated 9 years ago
openmole / gridscale
Scala library for accessing various file, batch systems, job schedulers and grid middlewares.
☆30 • Jun 24, 2026 • Updated 3 weeks ago
liquidm / druid-dumbo
☆21 • Mar 17, 2023 • Updated 3 years ago
UBOdin / mimir
Data-ish exploration through SQL+Uncertainty
☆28 • Jun 20, 2026 • Updated last month
sparsecode / DaFlow
Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple…
☆26 • Jun 7, 2021 • Updated 5 years ago
uber / uberscriptquery
UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy
☆65 • Dec 17, 2023 • Updated 2 years ago
SciScala / NDScala
N-dimensional / multi-dimensional arrays (tensors) in Scala 3. Think NumPy ndarray / PyTorch Tensor but type-safe over shapes, array/axis…
☆48 • Dec 22, 2022 • Updated 3 years ago
spoddutur / spark-streaming-monitoring-with-lightning
Plot live-stats as graph from ApacheSpark application using Lightning-viz
☆18 • Jul 3, 2017 • Updated 9 years ago
dwickern / sbt-classloader-leak-prevention
An sbt plugin to fix java.lang.OutOfMemoryError: Metaspace/PermGen errors during interactive sbt usage
☆14 • Feb 16, 2017 • Updated 9 years ago
markmo / featurestore
Building blocks and patterns for building data prep transformations and feature engineering in Spark.
☆16 • Mar 16, 2016 • Updated 10 years ago
cascala / galileo
Scala Math - Numerical (Matlab-like) and Symbolic (Mathematica-like) tool
☆72 • Nov 25, 2019 • Updated 6 years ago
sirthias / spliff
Efficient diffing in Scala
☆60 • Nov 4, 2025 • Updated 8 months ago
jmd1011 / parquet-readers
Apache Parquet reader in Scala without Apache Spark - developed at Purdue University
☆12 • Feb 17, 2017 • Updated 9 years ago
speedment / avro-mocker
Generate mock data based on an Apache Avro schema and specific cardinality settings
☆10 • Apr 16, 2018 • Updated 8 years ago
dbis-ilm / piglet
A compiler for Pig Latin to Spark and Flink.
☆24 • Nov 21, 2019 • Updated 6 years ago
decisionbrain / cplex-scala
A Scala library for IBM ILOG CPLEX
☆20 • Jan 27, 2020 • Updated 6 years ago
EmergentOrder / onnx-scala
An ONNX (Open Neural Network eXchange) API and backend for typeful, functional deep learning and classical machine learning in Scala 3
☆149 • Feb 17, 2026 • Updated 5 months ago
michaelahlers / faker-scala
Realistic sample value generators for Scala.
☆16 • Jul 4, 2024 • Updated 2 years ago