minio/spark-select

A library for Spark DataFrame using MinIO Select API

☆102

Alternatives and similar repositories for spark-select

Users who are interested in spark-select are comparing it to the libraries listed below.

mapreducelab / datalake-kubernetes
Collection of docker images, helm charts and other tools needed to build DataLake on Kubernetes.
☆13 · Oct 19, 2018 · Updated 7 years ago
mapreducelab / bigdata-helm-charts
Deprecated in favor of datalake-kubernetes. Collection of Helm charts for Kubernetes Big Data ecosystem products.
☆11 · Aug 9, 2018 · Updated 7 years ago
minio / bottlenet
Find bottlenecks in distributed network
☆23 · Dec 8, 2020 · Updated 5 years ago
yaooqinn / itachi
A library that brings useful functions from various modern database management systems to Apache Spark
☆63 · Sep 4, 2023 · Updated 2 years ago
minio / radio
Redundant Array of Distributed Independent Objectstores in short RADIO performs synchronous mirroring, erasure coding across multiple obj…
☆26 · Jun 9, 2020 · Updated 6 years ago
minio / m3
MinIO Kubernetes Cloud
☆28 · Jul 1, 2020 · Updated 6 years ago
innoverio / lakefs-trino-iceberg
☆11 · Feb 24, 2022 · Updated 4 years ago
joe-pll / minio-exporter
A Prometheus exporter for Minio cloud storage server
☆23 · Apr 24, 2018 · Updated 8 years ago
MrPowers / bebe
Filling in the Spark function gaps across APIs
☆50 · Apr 14, 2021 · Updated 5 years ago
Spratiher9 / JumpSpark
JumpSpark - A modern cookiecutter template for pyspark projects with batteries included.
☆10 · May 12, 2023 · Updated 3 years ago
minio / pkg
Repository to hold all the common packages imported by MinIO projects
☆32 · Updated this week
minio / lsync
Local syncing package with support for timeouts. This package offers both a sync.Mutex and sync.RWMutex compatible interface.
☆17 · Sep 20, 2019 · Updated 6 years ago
minio / homebrew-stable
Homebrew tap for MinIO
☆20 · Sep 7, 2025 · Updated 10 months ago
databricks / tableau-connector
☆13 · Mar 24, 2026 · Updated 4 months ago
hunters-ai / spark-adaptive-file-connector
Adaptive File Source Connector for Spark, optimised for reading from object stores
☆15 · Oct 18, 2022 · Updated 3 years ago
AlexMercedCoder / Pangolin
Pangolin is an Open-Source MIT Licensed Data Lakehouse Catalog in RUST with Iceberg REST Catalog Support
☆17 · Jan 2, 2026 · Updated 6 months ago
minio / minio-ruby
MinIO Client SDK for Ruby
☆27 · Apr 25, 2020 · Updated 6 years ago
KwintenP / rxjs-operators-from-scratch
Create all the RxJS operators from scratch
☆14 · Mar 19, 2019 · Updated 7 years ago
swoop-inc / spark-alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
☆191 · Oct 15, 2025 · Updated 9 months ago
primecloudlabs / kubernetes-on-premise
☆61 · Jul 12, 2026 · Updated last week
minio / nifi-minio
A custom ContentRepository implementation for NiFi to persist data to MinIO Object Storage
☆35 · Jul 15, 2022 · Updated 4 years ago
banzaicloud / spark-metrics
Spark metrics related custom classes and sinks (e.g. Prometheus)
☆186 · Aug 2, 2022 · Updated 3 years ago
G-Research / spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
☆239 · Jul 1, 2026 · Updated 3 weeks ago
dimajix / terraform-emr-training
Terraform script for launching multiple EMR clusters for training purposes.
☆16 · Oct 30, 2025 · Updated 8 months ago
minio / dperf
Drive performance measurement tool
☆77 · Dec 29, 2025 · Updated 6 months ago
Databeans / lighthouse
Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou…
☆10 · Jul 31, 2023 · Updated 2 years ago
tencentyun / flink-cos-fs
Flink-cos-fs is the Flink file system implementation for Tencent Cloud Object Storage (COS), and it supports the recoverwriter interface.
☆36 · Dec 23, 2025 · Updated 7 months ago
Kyligence / kylin-tpch
Run TPCH Benchmark on Apache Kylin
☆22 · Jan 24, 2022 · Updated 4 years ago
akolb1 / gometastore
Go Client for Hive Metastore
☆14 · Dec 18, 2022 · Updated 3 years ago
viaduct-ai / docker-spark-k8s-aws
Docker image for running Spark 3 on Kubernetes on AWS
☆26 · May 26, 2021 · Updated 5 years ago
kublr / demos
Support files for Kublr Demo Scenarios
☆14 · Dec 6, 2022 · Updated 3 years ago
minio / kes
[Deprecated] Key Encryption Server
☆498 · May 27, 2025 · Updated last year
playframework / play-file-watch
This is the Play File Watch library
☆20 · Updated this week
Ansuman-rath / DevOps_Roadmap_Project
DevOps Projects undertaken in Roadmap.sh
☆16 · Aug 19, 2025 · Updated 11 months ago
obaidsalikeen / storm-marathon
Apache Storm 0.9.3-rc1 Docker cluster deployed on Apache Mesos with Marathon.
☆11 · Jan 5, 2015 · Updated 11 years ago
danhper / scalog
Datalog implementation in Scala.
☆12 · Jun 17, 2014 · Updated 12 years ago
cloudcheflabs / dataroaster
☆42 · May 16, 2023 · Updated 3 years ago
zhu260824 / cisco_vpn_client-
Cisco VPN client
☆13 · Nov 24, 2016 · Updated 9 years ago
mrpowers-io / spark-daria
Essential Spark extensions and helper methods ✨😲
☆767 · Jun 22, 2026 · Updated last month