Stocator is a high-performance connector to object storage for Apache Spark; it achieves its performance by leveraging object storage semantics.
☆115 · May 17, 2024 · Updated last year
Alternatives and similar repositories for stocator
Users that are interested in stocator are comparing it to the libraries listed below.
- Fybrik ☆132 · Sep 7, 2025 · Updated 6 months ago
- Mirror of Apache crail (Incubating) ☆150 · Jul 3, 2022 · Updated 3 years ago
- [Archived] A Fast Multi-tiered Distributed Storage System based on User-Level I/O ☆74 · Mar 2, 2018 · Updated 8 years ago
- A terminal emulator embedded in an IPython/Jupyter notebook. ☆27 · Feb 3, 2022 · Updated 4 years ago
- Fast I/O plugins for Spark ☆41 · Dec 14, 2020 · Updated 5 years ago
- Elephant Twin is a framework for creating indexes in Hadoop ☆98 · Oct 12, 2020 · Updated 5 years ago
- A Gateway for connecting application services in different domains, networks, and cloud infrastructures ☆23 · Feb 1, 2026 · Updated last month
- Cache File System optimized for columnar formats and object stores ☆187 · Aug 11, 2022 · Updated 3 years ago
- Allow overridden prestashop module php ☆25 · Jun 5, 2014 · Updated 11 years ago
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines ☆17 · Jan 21, 2020 · Updated 6 years ago
- Rocksdb state storage implementation for Structured Streaming. ☆17 · Oct 21, 2020 · Updated 5 years ago
- ☆13 · Dec 2, 2025 · Updated 3 months ago
- riemann tool for cassandra ☆31 · May 19, 2016 · Updated 9 years ago
- PySpark Notebook and Shiny App for Demo ☆34 · Mar 24, 2017 · Updated 9 years ago
- A library for financial and time series calculations on Apache Spark ☆28 · Feb 2, 2016 · Updated 10 years ago
- A Kafka JMX configuration file ☆20 · Jul 9, 2018 · Updated 7 years ago
- Python Helper library for Jupyter Notebooks ☆1,041 · Feb 16, 2021 · Updated 5 years ago
- Benchmark Suite for Apache Spark ☆240 · Apr 12, 2023 · Updated 2 years ago
- ☆24 · Feb 4, 2021 · Updated 5 years ago
- Hadoop YARN & MapReduce Memory Calculator ☆13 · Nov 9, 2015 · Updated 10 years ago
- Flash cache solution iostash ☆11 · Jun 23, 2016 · Updated 9 years ago
- Active learning of GP hyperparameters following Garnett, et al., "Active Learning of Linear Embeddings for Gaussian Processes," (UAI 2014… ☆16 · Aug 4, 2017 · Updated 8 years ago
- High performance HBase / Spark SQL engine ☆28 · Jul 7, 2022 · Updated 3 years ago
- hanythingondemand provides a set of scripts to easily set up an ad-hoc Hadoop cluster through PBS jobs ☆12 · Jul 2, 2019 · Updated 6 years ago
- A quotation-based Scala DSL for scalable data analysis. ☆63 · Jul 7, 2022 · Updated 3 years ago
- Jupyter Hub Support in VS Code ☆17 · Updated this week
- A library for exporting Spark ML models and pipelines to PFA ☆55 · Nov 21, 2018 · Updated 7 years ago
- A file system backed by AntidoteDB. ☆13 · Jun 10, 2021 · Updated 4 years ago
- How to setup the USG to use ProtonVPN ☆12 · Nov 21, 2018 · Updated 7 years ago
- NVMesh Container Storage Interface (CSI) Driver for Kubernetes ☆12 · Oct 7, 2024 · Updated last year
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components ☆10 · Oct 11, 2019 · Updated 6 years ago
- Apache NiFi Python Extensions ☆25 · Nov 13, 2024 · Updated last year
- Additional useful algorithms that can be used with spark. ☆24 · Dec 24, 2014 · Updated 11 years ago
- RGW PubSub API Clients ☆14 · Dec 4, 2019 · Updated 6 years ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines ☆30 · Feb 1, 2016 · Updated 10 years ago
- H3 is an embedded object store in C, Python, and Java ☆12 · Jul 7, 2021 · Updated 4 years ago
- Java event logs collector for hadoop and frameworks ☆41 · Mar 25, 2025 · Updated last year
- Dockerfile for a base Logstash image to be extended by others (allow to install plug-ins, change configuration, etc.) ☆10 · Jan 16, 2017 · Updated 9 years ago
- Next-generation web analytics processing with Scala, Spark, and Parquet. ☆330 · Mar 28, 2015 · Updated 10 years ago