qubole/spark-acid

qubole / spark-acid

ACID Data Source for Apache Spark based on Hive ACID

☆97

Alternatives and similar repositories for spark-acid

Users who are interested in spark-acid are comparing it to the libraries listed below.

qubole / streaminglens
Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines
☆17 · Jan 21, 2020 · Updated 6 years ago
qubole / spark-state-store
RocksDB state storage implementation for Structured Streaming.
☆17 · Oct 21, 2020 · Updated 5 years ago
oracle / spark-oracle
On-the-fly translation of Spark programs to run natively on your Oracle DB. Your Spark programs require no changes.
☆36 · Apr 15, 2025 · Updated last year
hortonworks-spark / spark-llap
☆102 · Mar 23, 2020 · Updated 6 years ago
yaooqinn / spark-postgres
PostgreSQL and GreenPlum Data Source for Apache Spark
☆35 · May 6, 2026 · Updated 2 months ago
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated last month
qubole / sparklens
Qubole Sparklens tool for performance tuning Apache Spark
☆592 · Jun 26, 2024 · Updated 2 years ago
chermenin / spark-states
Custom state store providers for Apache Spark
☆92 · Feb 14, 2025 · Updated last year
youngwookim / awesome-presto
A curated list of awesome PrestoDB / Trino software, libraries, tools and resources
☆18 · Jun 28, 2021 · Updated 5 years ago
aravinthsci / Spark_Delta_Lake
Delta Lake Examples
☆11 · Apr 24, 2020 · Updated 6 years ago
rbrush / kite-apps
Prescriptive Applications over Kite and Hadoop
☆12 · Oct 14, 2015 · Updated 10 years ago
YotpoLtd / metorikku
A simplified, lightweight ETL Framework based on Apache Spark
☆588 · Jan 24, 2024 · Updated 2 years ago
ZuInnoTe / spark-hadoopoffice-ds
A Spark datasource for the HadoopOffice library
☆36 · Sep 29, 2025 · Updated 9 months ago
ExpediaGroup / shunting-yard
Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
☆20 · Oct 11, 2021 · Updated 4 years ago
microsoft / hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
☆430 · Jan 14, 2022 · Updated 4 years ago
HeartSaVioR / spark-state-tools
Spark Structured Streaming State Tools
☆34 · Jul 3, 2020 · Updated 6 years ago
cerndb / SparkPlugins
Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…
☆96 · May 11, 2026 · Updated 2 months ago
MemVerge / splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
☆131 · Dec 19, 2024 · Updated last year
bakdata / rebalancing-demo
Repository that showcases problems with Kafka rebalancing and explains how to fix them. Please visit our blog article to learn what Kafka…
☆12 · Aug 21, 2020 · Updated 5 years ago
hortonworks / cb-cli
☆12 · Jun 26, 2023 · Updated 3 years ago
Netflix / iceberg
Iceberg is a table format for large, slow-moving tabular data
☆494 · Apr 10, 2023 · Updated 3 years ago
linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week
risingwavelabs / sqlparser-rs
Extensible SQL Lexer and Parser for Rust
☆12 · Dec 22, 2021 · Updated 4 years ago
apache / hbase-connectors
Apache HBase Connectors
☆246 · Jul 13, 2026 · Updated last week
hortonworks / streamline
StreamLine - Streaming Analytics
☆167 · Aug 27, 2023 · Updated 2 years ago
bmc / spark-hive-udf
Example project showing how to use Hive UDFs in Apache Spark
☆55 · Apr 23, 2019 · Updated 7 years ago
eastcirclek / flink-service-discovery
Discover Flink clusters on Hadoop YARN for Prometheus
☆23 · Aug 5, 2020 · Updated 5 years ago
ExpediaGroup / circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
☆93 · Mar 5, 2024 · Updated 2 years ago
hortonworks-spark / cloud-integration
Spark cloud integration: tests, cloud committers and more
☆20 · Jan 30, 2025 · Updated last year
querifylabs / querifylabs-blog
Code samples from blog posts https://www.querifylabs.com/blog
☆28 · Dec 13, 2021 · Updated 4 years ago
sakserv / hadoop-mini-clusters
hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE
☆296 · Jan 2, 2023 · Updated 3 years ago
assafmendelson / DataSourceV2
☆23 · Oct 8, 2018 · Updated 7 years ago
yaooqinn / spark-ranger
Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.
☆59 · Nov 11, 2021 · Updated 4 years ago
cloudera / flink-tutorials
☆204 · Jun 26, 2026 · Updated 3 weeks ago
Impetus / jumbune
Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:…
☆73 · Jan 1, 2023 · Updated 3 years ago
milinda / samza-sql
SamzaSQL: Streaming SQL implementation on top of Apache Samza and Apache Kafka
☆30 · Jun 8, 2016 · Updated 10 years ago
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
airbnb / sputnik
☆64 · Nov 8, 2019 · Updated 6 years ago
yaooqinn / spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is contributed to Apache Kyuubi | The project has moved to Apa…
☆183 · Apr 6, 2022 · Updated 4 years ago