xskipper-io/xskipper

An Extensible Data Skipping Framework

☆50

Alternatives and similar repositories for xskipper

Users that are interested in xskipper are comparing it to the libraries listed below.

crdt-ibm-research / json-delta-crdt
DSON - JSON CRDT Using Delta-Mutations
☆28 · Nov 9, 2021 · Updated 4 years ago
apache / kyuubi-client
Client libraries of end users of Apache Kyuubi
☆11 · May 15, 2026 · Updated 2 months ago
IBM-Cloud / sql-query-clients
Client samples for IBM Cloud SQL Query service
☆12 · Mar 4, 2024 · Updated 2 years ago
aljoscha / blog
Thoughts on things I find interesting.
☆17 · Dec 19, 2024 · Updated last year
microsoft / lst-bench
LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De…
☆90 · Apr 10, 2026 · Updated 3 months ago
oliversavio / hudi-data-lake-example
☆10 · Apr 13, 2020 · Updated 6 years ago
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
apache / kyuubi-docker
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆16 · May 22, 2026 · Updated 2 months ago
JoshRosen / hive
Mirror of Apache Hive
☆33 · Mar 16, 2020 · Updated 6 years ago
lancedb / ocra
OCRA: Object-store Cache in Rust for All
☆19 · Sep 29, 2025 · Updated 9 months ago
zabetak / calcite-tutorial
☆49 · Feb 14, 2022 · Updated 4 years ago
prestodb / benchto
Framework for running macro benchmarks in a clustered environment
☆25 · Aug 29, 2022 · Updated 3 years ago
SaurabhChawla100 / spark-radiant
Spark-Radiant is Apache Spark Performance and Cost Optimizer
☆25 · Dec 31, 2024 · Updated last year
maropu / spark-sql-flow-plugin
Visualize column-level data lineage in Spark SQL
☆92 · May 13, 2022 · Updated 4 years ago
frb502 / spark-skewed-join-hint
A custom Spark SQL hint optimizer that resolves JOIN data skew caused by hot-spot data
☆48 · Jan 4, 2019 · Updated 7 years ago
allwefantasy / sql-code-intelligence
SQL code autocomplete
☆45 · Sep 2, 2020 · Updated 5 years ago
substrait-io / substrait-java
☆101 · Updated this week
swiftonfile / swiftonfile
Swift-on-File has moved to Stackforge:
☆21 · Nov 28, 2017 · Updated 8 years ago
oap-project / sql-ds-cache
Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.
☆37 · Jan 3, 2023 · Updated 3 years ago
EnterpriseDB / pg-airman-mcp
☆17 · Jun 4, 2026 · Updated last month
kubeflow / mcp-apache-spark-history-server
MCP Server and CLI for Apache Spark History Server. Debug Spark applications from AI agents, scripts, or the terminal.
☆183 · Jul 16, 2026 · Updated last week
uber / RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
☆335 · Sep 29, 2023 · Updated 2 years ago
boostscale / velox4j
Community Java bindings for https://github.com/facebookincubator/velox
☆43 · Updated this week
wanqiufeng / hudi-learn
☆12 · Jan 1, 2021 · Updated 5 years ago
fybrik / fybrik
Fybrik
☆130 · Sep 7, 2025 · Updated 10 months ago
kongo2002 / blockade
Docker-based utility for testing network failures and partitions in distributed applications
☆10 · Oct 4, 2016 · Updated 9 years ago
linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week
oap-project / Gluten-Trino
Gluten: Plugin to Boost Trino's Performance
☆75 · Oct 25, 2023 · Updated 2 years ago
MeetYouDevs / hbase-manager
☆23 · Sep 25, 2024 · Updated last year
adobe / lake-pulse
A Rust library for analyzing data lake table health (checking the pulse) across multiple formats (Delta Lake, Apache Iceberg, Apache Hu…
☆20 · Jul 11, 2026 · Updated last week
CoxAutomotiveDataSolutions / spark-distcp
A re-implementation of Hadoop DistCP in Apache Spark
☆47 · Dec 20, 2023 · Updated 2 years ago
facebookincubator / axiom
Axiom is a set of reusable and extensible components designed to be compatible with Velox. Its primary purpose is to simplify the process…
☆79 · Updated this week
NetEase / spark-alarm
Alerting and monitoring tool for Apache Spark
☆23 · May 20, 2022 · Updated 4 years ago
Tencent / Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shu…
☆256 · Apr 7, 2023 · Updated 3 years ago
alibaba / SparkCube
SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.
☆136 · Mar 6, 2023 · Updated 3 years ago
lyft / presto-gateway
A load balancer / proxy / gateway for prestodb
☆359 · Jul 25, 2024 · Updated last year
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated 3 weeks ago
qubole / presto-udfs
Plugin for Presto to allow addition of user functions easily
☆119 · Mar 31, 2021 · Updated 5 years ago
cloudera / dbt-spark-livy
The dbt-spark-livy adapter allows you to use dbt along with Apache Spark, by connecting via Apache Livy
☆12 · Mar 30, 2023 · Updated 3 years ago