Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
☆115 · May 17, 2024 · Updated last year
Alternatives and similar repositories for stocator
Users that are interested in stocator are comparing it to the libraries listed below.
- Facilitates Data I/O between Spark and IBM Object Storage services. ☆10 · Feb 26, 2019 · Updated 7 years ago
- WARNING: This repository is no longer maintained. The Insights for Twitter service from IBM Cloud has been sunset. This repository will n… ☆11 · Apr 10, 2019 · Updated 7 years ago
- ☆36 · Mar 18, 2026 · Updated 3 weeks ago
- Lithops-based Serverless implementation of the METASPACE spatial metabolomics annotation pipeline ☆12 · Jul 6, 2023 · Updated 2 years ago
- Mirror of Apache crail (Incubating) ☆151 · Jul 3, 2022 · Updated 3 years ago
- Netezza Connector for Apache Spark ☆13 · Sep 10, 2018 · Updated 7 years ago
- [Archived] A Fast Multi-tiered Distributed Storage System based on User-Level I/O ☆74 · Mar 2, 2018 · Updated 8 years ago
- Example projects for 'BigInsights for Apache Hadoop' on IBM Bluemix ☆23 · Sep 27, 2017 · Updated 8 years ago
- Mirror of Apache Bahir ☆336 · Jul 7, 2023 · Updated 2 years ago
- Elephant Twin is a framework for creating indexes in Hadoop ☆98 · Oct 12, 2020 · Updated 5 years ago
- A Gateway for connecting application services in different domains, networks, and cloud infrastructures ☆23 · Feb 1, 2026 · Updated 2 months ago
- MLeap allows for easily putting Spark ML pipelines into production ☆78 · Oct 27, 2016 · Updated 9 years ago
- Cache File System optimized for columnar formats and object stores ☆188 · Aug 11, 2022 · Updated 3 years ago
- Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream … ☆22 · Feb 6, 2017 · Updated 9 years ago
- Mirror of Apache Toree (Incubating) ☆749 · Apr 2, 2026 · Updated last week
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines ☆17 · Jan 21, 2020 · Updated 6 years ago
- Enabling Spark Optimization through Cross-stack Monitoring and Visualization ☆47 · Aug 23, 2017 · Updated 8 years ago
- RocksDB state storage implementation for Structured Streaming. ☆17 · Oct 21, 2020 · Updated 5 years ago
- ☆13 · Dec 2, 2025 · Updated 4 months ago
- Riemann tool for Cassandra ☆31 · May 19, 2016 · Updated 9 years ago
- PySpark Notebook and Shiny App for Demo ☆34 · Mar 24, 2017 · Updated 9 years ago
- A library for financial and time series calculations on Apache Spark ☆28 · Feb 2, 2016 · Updated 10 years ago
- Python Helper library for Jupyter Notebooks ☆1,041 · Feb 16, 2021 · Updated 5 years ago
- Benchmark Suite for Apache Spark ☆240 · Apr 12, 2023 · Updated 3 years ago
- Java API for libaio ☆14 · Jan 10, 2022 · Updated 4 years ago
- Hadoop YARN & MapReduce Memory Calculator ☆13 · Nov 9, 2015 · Updated 10 years ago
- ☆11 · May 16, 2022 · Updated 3 years ago
- hanythingondemand provides a set of scripts to easily set up an ad-hoc Hadoop cluster through PBS jobs ☆12 · Jul 2, 2019 · Updated 6 years ago
- A quotation-based Scala DSL for scalable data analysis. ☆64 · Jul 7, 2022 · Updated 3 years ago
- Jupyter Hub Support in VS Code ☆17 · Apr 2, 2026 · Updated last week
- A library for exporting Spark ML models and pipelines to PFA ☆55 · Nov 21, 2018 · Updated 7 years ago
- A persistent LSM key-value store. FloDB is designed to scale with the number of threads and memory size. ☆26 · Mar 28, 2017 · Updated 9 years ago
- ☆11 · Jul 15, 2014 · Updated 11 years ago
- NVMesh Container Storage Interface (CSI) Driver for Kubernetes ☆12 · Oct 7, 2024 · Updated last year
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components ☆10 · Oct 11, 2019 · Updated 6 years ago
- Apache NiFi Python Extensions ☆26 · Nov 13, 2024 · Updated last year
- Additional useful algorithms that can be used with Spark. ☆24 · Dec 24, 2014 · Updated 11 years ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines ☆30 · Feb 1, 2016 · Updated 10 years ago
- Stocks -> NiFi -> Kafka -> Profit ☆14 · Nov 16, 2018 · Updated 7 years ago