A library for Spark DataFrames using the MinIO Select API
☆101Sep 27, 2019Updated 6 years ago
Alternatives and similar repositories for spark-select
Users that are interested in spark-select are comparing it to the libraries listed below
- Deprecated in favor of datalake-kubernetes. Collection of Helm charts for Kubernetes Big Data ecosystem products☆11Aug 9, 2018Updated 7 years ago
- Find bottlenecks in distributed network☆23Dec 8, 2020Updated 5 years ago
- Collection of docker images, helm charts and other tools needed to build DataLake on Kubernetes.☆13Oct 19, 2018Updated 7 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark☆61Sep 4, 2023Updated 2 years ago
- Filling in the Spark function gaps across APIs☆50Apr 14, 2021Updated 4 years ago
- Advanced fold methods for Kotlin☆12Updated this week
- Adaptive File Source Connector for Spark, optimised for reading from object stores☆15Oct 18, 2022Updated 3 years ago
- A Prometheus exporter for Minio cloud storage server☆23Apr 24, 2018Updated 7 years ago
- FluxCD and Express.js GitOps tutorial for Civo☆80Jan 29, 2020Updated 6 years ago
- Datalog implementation in Scala.☆12Jun 17, 2014Updated 11 years ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included.☆10May 12, 2023Updated 2 years ago
- Hyper-Scale Machine Learning with MinIO and TensorFlow☆11Mar 25, 2023Updated 2 years ago
- Authenticated encryption for streams in Go☆30Dec 18, 2023Updated 2 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive☆187Oct 15, 2025Updated 4 months ago
- Local syncing package with support for timeouts. This package offers both a sync.Mutex and sync.RWMutex compatible interface.☆18Sep 20, 2019Updated 6 years ago
- Minio Object Storage Server☆10Nov 19, 2019Updated 6 years ago
- Support files for Kublr Demo Scenarios☆14Dec 6, 2022Updated 3 years ago
- Graphite integration for Kafka☆15Jun 5, 2018Updated 7 years ago
- Terraform script for launching multiple EMR clusters for training purposes.☆16Oct 30, 2025Updated 4 months ago
- Go Client for Hive Metastore☆14Dec 18, 2022Updated 3 years ago
- OpenAPI 3.2, 3.1 & 3.0 Parser & JSON Schema Validator, Java☆21Updated this week
- Generating Federated GraphQL APIs from Datasources with Apache Calcite☆37Feb 21, 2022Updated 4 years ago
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…☆94May 9, 2025Updated 9 months ago
- A Scala client for Druid☆16Jan 12, 2017Updated 9 years ago
- Asynchronously writes journal and snapshot entries to configured R2DBC databases so that Apache Pekko Actors can recover state☆19Updated this week
- Akka plugin to collect various data about actors☆17Aug 19, 2024Updated last year
- A custom ContentRepository implementation for NiFi to persist data to MinIO Object Storage☆35Jul 15, 2022Updated 3 years ago
- Spark metrics related custom classes and sinks (e.g. Prometheus)☆188Aug 2, 2022Updated 3 years ago
- Repository for the Spark-Vector connector☆20Jul 7, 2021Updated 4 years ago
- Task Metrics Explorer☆14Apr 2, 2019Updated 6 years ago
- 🎮 Notebook Enterprise Summit☆18Jun 15, 2021Updated 4 years ago
- Slides for our 2015 event☆13May 14, 2015Updated 10 years ago
- Data monitoring tool, monitors the result, not the run☆16Dec 16, 2021Updated 4 years ago
- Rust LLVM Practises☆17Dec 29, 2020Updated 5 years ago
- An example of kubernetes scheduler extender☆15Apr 12, 2019Updated 6 years ago
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.☆37Jan 3, 2023Updated 3 years ago