Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local disks.
☆21Mar 15, 2024Updated 2 years ago
Alternatives and similar repositories for remote-shuffle
Users that are interested in remote-shuffle are comparing it to the libraries listed below.
- Hadoop InputFormat for http://druid.io/☆10Oct 26, 2016Updated 9 years ago
- Remote shuffle service for Apache Spark to store shuffle data on remote servers.☆335Sep 29, 2023Updated 2 years ago
- Ted is a line oriented text editor and formatter☆12Jun 29, 2020Updated 5 years ago
- An ambient sound generator using free sounds from BBC Sound Effects☆14Dec 3, 2023Updated 2 years ago
- Html Content / Article Extractor in Scala☆18May 23, 2018Updated 8 years ago
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.☆37Jan 3, 2023Updated 3 years ago
- ☆12Apr 7, 2025Updated last year
- ☆18May 7, 2026Updated last month
- SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.☆136Mar 6, 2023Updated 3 years ago
- Java implementation of the SCRAM SASL for both server and client plus examples☆17Apr 18, 2021Updated 5 years ago
- This project keeps the patches that were submitted to the open-source community☆16Oct 22, 2017Updated 8 years ago
- Apache Spark - A unified analytics engine for large-scale data processing☆16Jul 24, 2023Updated 2 years ago
- Mirror of Apache Hadoop common☆108Jul 8, 2020Updated 5 years ago
- HMM-guided metagenomic gene-targeted assembler using iterative de Bruijn graphs☆18Oct 3, 2016Updated 9 years ago
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.☆256Feb 21, 2023Updated 3 years ago
- Cache File System optimized for columnar formats and object stores☆188Aug 11, 2022Updated 3 years ago
- A Trino ODBC driver☆15Jan 10, 2024Updated 2 years ago
- Plugin to accelerate Spark SQL with the NEC Vector Engine.☆19Aug 15, 2022Updated 3 years ago
- Deploy a simple Multi-Node Clickhouse Cluster with docker-compose in minutes.☆17Feb 11, 2022Updated 4 years ago
- Supporting code for Learning to Rank (LTR) presentation☆16Oct 11, 2018Updated 7 years ago
- Memtier benchmark front-end☆10May 9, 2023Updated 3 years ago
- Node.js kafka connect connector for prometheus☆13Dec 7, 2022Updated 3 years ago
- Open-source event streaming platform built on S3. Kafka-compatible APIs, built-in SQL engine, schema registry — one Rust binary replace…☆65May 21, 2026Updated 3 weeks ago
- Performance Analysis Tool☆78Nov 25, 2025Updated 6 months ago
- DBT CLI MCP Server☆18Jun 26, 2025Updated 11 months ago
- A query predictor pipeline and service to predict resource usages of Presto queries☆14May 2, 2023Updated 3 years ago
- Fluent-Bit output plugin for Google Cloud Storage☆12Jul 13, 2021Updated 4 years ago
- An all in one Twitter video downloader☆12Jan 10, 2024Updated 2 years ago
- SFTP server that works on top of HDFS. It is based on Apache sshd and provides access to and operation of HDFS through the SFTP protocol.☆15Aug 18, 2023Updated 2 years ago
- Hadoop Profiler, or hprofiler, is a tool which is able to analyze on- and off-CPU workloads on distributed computing environments.☆24Jul 7, 2016Updated 9 years ago
- Easy way to send Finagle metrics to Codahale Metrics library☆42Apr 2, 2020Updated 6 years ago
- Scalable NameNode RPC Proxy for HDFS Federation☆88Apr 19, 2016Updated 10 years ago
- Golang library for using persistent memory☆29Oct 7, 2022Updated 3 years ago
- A reader that buffers ranged calls☆12May 17, 2022Updated 4 years ago
- Yet another gl-matrix: faster and smaller.☆17Jan 20, 2018Updated 8 years ago
- Mirror of Apache Ranger☆15Apr 5, 2024Updated 2 years ago
- Hadoop filesystem implementation for Aliyun OSS☆13Feb 14, 2016Updated 10 years ago
- Lightdash Community helm charts☆24May 26, 2026Updated 2 weeks ago
- Apache Spark build compatible with AWS Glue Data Catalog.☆19Aug 9, 2021Updated 4 years ago