spirom/spark-data-sources

Developing Spark External Data Sources using the V2 API

☆49

Alternatives and similar repositories for spark-data-sources

Users who are interested in spark-data-sources are comparing it to the libraries listed below.

assafmendelson / DataSourceV2
☆23 · Oct 8, 2018 · Updated 7 years ago
rymurr / flight-spark-source
☆109 · Jul 5, 2023 · Updated 3 years ago
ExpediaGroup / hiveberg
Demonstration of a Hive Input Format for Iceberg
☆26 · Mar 12, 2021 · Updated 5 years ago
aokolnychyi / spark-custom-datasource-example
A sample implementation of the Spark Datasource API
☆23 · Apr 15, 2017 · Updated 9 years ago
shirukai / spark-structured-datasource
Custom data source for Spark Structured Streaming
☆12 · Jan 29, 2019 · Updated 7 years ago
spirom / learning-spark-with-java
Self-contained examples using Apache Spark with the functional features of Java 8
☆64 · Apr 8, 2018 · Updated 8 years ago
tmscarla / akka-big-data
Implementation of a Big Data (batch and stream) distributed processing engine in Java using Akka actors.
☆12 · Feb 20, 2023 · Updated 3 years ago
Azure / azure-relay-node
☁️ Node.js library for Azure Relay Hybrid Connections
☆12 · Updated this week
brues / btc-demo
A simple implementation of Bitcoin
☆12 · Jun 17, 2022 · Updated 4 years ago
phatak-dev / spark-3.0-examples
Examples of Spark 3.0
☆44 · Nov 11, 2020 · Updated 5 years ago
spirom / spark-streaming-with-kafka
Self-contained examples of Apache Spark streaming integrated with Apache Kafka.
☆196 · Apr 15, 2018 · Updated 8 years ago
avensolutions / cdc-at-scale-using-spark
Scalable CDC Pattern Implemented using PySpark
☆18 · Oct 8, 2025 · Updated 9 months ago
nyukhalov / akka-http-actor-per-request
Example Akka application that uses the actor-per-request model
☆16 · Aug 21, 2017 · Updated 8 years ago
goodwillpunning / nodejs-sharing-client
A Node.js connector for Delta Sharing.
☆12 · Apr 3, 2025 · Updated last year
cguegi / azure-databricks-airflow-example
Example of orchestrating dependent Databricks jobs using Airflow
☆11 · Dec 19, 2019 · Updated 6 years ago
BryanCutler / SparkArrowFlight
Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients
☆37 · Mar 9, 2021 · Updated 5 years ago
getyourguide / db-rocket
Keep your local python scripts installed and in sync with a databricks notebook. Shortens the feedback loop to develop projects using a h…
☆16 · Jun 16, 2025 · Updated last year
trek10inc / lambda-local-cache
☆10 · Jul 5, 2016 · Updated 10 years ago
zrlio / parquet-generator
Parquet file generator
☆22 · Apr 17, 2018 · Updated 8 years ago
ShujianQian / epic-eval
☆10 · May 15, 2024 · Updated 2 years ago
jerryshao / spark-kafka-0-8-sql
Spark Structured Streaming Kafka 0.8 Source Implementation
☆35 · Apr 27, 2017 · Updated 9 years ago
jeoffreylim / maelstrom
Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream …
☆21 · Feb 6, 2017 · Updated 9 years ago
stanzhai / jvm-exercise
JVM related exercises
☆11 · Jul 16, 2017 · Updated 9 years ago
phatak-dev / spark2.0-examples
Examples of Spark 2.0
☆213 · Aug 11, 2021 · Updated 4 years ago
amient / affinity
Library and a Framework for building fast, scalable, fault-tolerant Data APIs based on Akka, Avro, ZooKeeper and Kafka
☆25 · Oct 16, 2020 · Updated 5 years ago
banzaicloud / spark-metrics
Spark metrics related custom classes and sinks (e.g. Prometheus)
☆186 · Aug 2, 2022 · Updated 3 years ago
phatak-dev / flink-examples
Flink Examples
☆39 · Apr 27, 2016 · Updated 10 years ago
moyano83 / High-Performance-Spark
☆31 · Oct 14, 2019 · Updated 6 years ago
anilshanbhag / gpu-compression
☆20 · May 5, 2024 · Updated 2 years ago
kokobing / iris-middleware-casbin
go iris casbin redis
☆10 · Feb 25, 2020 · Updated 6 years ago
tweetmagik / spark-yarn
Launch Spark clusters on YARN
☆24 · Aug 29, 2011 · Updated 14 years ago
thesquelched / spark-lineage
Spark SQL listener to record lineage information
☆28 · Jan 24, 2021 · Updated 5 years ago
TU-Berlin-DIMA / fast-interconnects
Research project on scaling GPU-accelerated data management to large data volumes. Code base of two SIGMOD papers.
☆17 · Jun 14, 2022 · Updated 4 years ago
FINRAOS / herd
Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabyt…
☆140 · Oct 1, 2022 · Updated 3 years ago
amazon-archives / amazon-cognito-streams-sample
Sample demonstrating consuming Amazon Cognito Streams
☆10 · Jun 15, 2020 · Updated 6 years ago
lizhitao0923 / ansible-hadoop
Ansible playbooks to help deploy Apache Hadoop, Spark, Storm, Zookeeper, Elasticsearch, Azkaban, Flume, HBase, Kafka, Kibana, Logstash
☆10 · Mar 21, 2017 · Updated 9 years ago
hjacobs / connexion-example-redis-kubernetes
Connexion Example REST Service with Redis Store
☆23 · Oct 21, 2019 · Updated 6 years ago
Henning1 / resql
Low-latency query compiler
☆17 · Jun 3, 2022 · Updated 4 years ago
fredrikhgrelland / data-mesh
A cloud native data mesh implementation
☆12 · Jan 15, 2021 · Updated 5 years ago