audienceproject/spark-dynamodb


audienceproject / spark-dynamodb

Plug-and-play implementation of an Apache Spark custom data source for AWS DynamoDB.

☆174

Alternatives and similar repositories for spark-dynamodb

Users who are interested in spark-dynamodb are comparing it to the libraries listed below.


traviscrawford / spark-dynamodb
DynamoDB data source for Apache Spark
☆95 · Sep 2, 2021 · Updated 4 years ago
awslabs / emr-dynamodb-connector
Implementations of open source Apache Hadoop/Hive interfaces which allow for ingesting data from Amazon DynamoDB
☆228 · Apr 8, 2026 · Updated 3 months ago
audienceproject / crossbow
Single node, in-memory DataFrame analytics library.
☆44 · Mar 6, 2026 · Updated 4 months ago
Marcus-L / serverless-mailgun-slack
A Serverless function for posting to a Slack Webhook in response to a Mailgun route
☆11 · Oct 12, 2016 · Updated 9 years ago
KyloIO / kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies…
☆22 · Jan 10, 2019 · Updated 7 years ago
airflow-plugins / mysql_plugin
☆16 · Apr 25, 2019 · Updated 7 years ago
awsdocs / amazon-emr-management-guide
The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r…
☆62 · Jun 15, 2023 · Updated 3 years ago
vinicelms / emr-monitoring-prometheus-grafana
Project to concentrate files and settings for AWS EMR monitoring. Source: https://aws.amazon.com/blogs/big-data/monitor-and-optimize-anal…
☆15 · Oct 11, 2024 · Updated last year
scanamo / scanamo
Simpler DynamoDB access for Scala
☆318 · May 25, 2026 · Updated last month
freneticdisc / oracle-fmw-tooling
Project to build WebLogic Domains with Oracle Fusion Middleware 12c components using scripts.
☆12 · Jul 13, 2018 · Updated 8 years ago
spark-redshift-community / spark-redshift
Performant Redshift data source for Apache Spark
☆140 · Jun 5, 2026 · Updated last month
minio / spark-select
A library for Spark DataFrame using MinIO Select API
☆102 · Sep 27, 2019 · Updated 6 years ago
qubole / kinesis-sql
Kinesis Connector for Structured Streaming
☆139 · Jul 2, 2024 · Updated 2 years ago
steveloughran / zero-rename-committer
Paper: A Zero-rename committer for object stores
☆20 · Nov 7, 2025 · Updated 8 months ago
azavea / terraform-aws-emr-cluster
A Terraform module to create an Amazon Web Services (AWS) Elastic MapReduce (EMR) cluster.
☆39 · Oct 21, 2019 · Updated 6 years ago
aws-samples / aws-emr-apache-ranger
☆24 · Oct 3, 2023 · Updated 2 years ago
SETL-Framework / setl
A simple Spark-powered ETL framework that just works 🍺
☆186 · Oct 2, 2025 · Updated 9 months ago
softprops / unisockets
Unix domain sockets that look just like TCP sockets
☆11 · Jun 21, 2018 · Updated 8 years ago
aws-samples / kda-flink-app-autoscaling
This repo demonstrates how to use AWS application auto-scaling to implement custom-scaling in your Kinesis Data Analytics for Apache Flin…
☆19 · Feb 21, 2025 · Updated last year
aws-samples / aws-concurrent-data-orchestration-pipeline-emr-livy
This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu…
☆76 · Oct 30, 2018 · Updated 7 years ago
insightfuls / better-https-proxy-agent
An agent for HTTPS through an HTTP(S) proxy server using the CONNECT method
☆12 · Dec 30, 2022 · Updated 3 years ago
lokkju / github-action-sbt
GitHub Actions support for building SBT projects
☆14 · Feb 9, 2021 · Updated 5 years ago
Spratiher9 / JumpSpark
JumpSpark - A modern cookiecutter template for pyspark projects with batteries included.
☆10 · May 12, 2023 · Updated 3 years ago
didil / serverless-lambda-sns-example
Serverless Lambda PubSub via SNS Example
☆35 · Jul 23, 2018 · Updated 8 years ago
ahujaraman / live_log_analyzer_spark
Spark application for analyzing Apache access logs and detecting anomalies. Accompanied by a Medium article.
☆21 · Jan 30, 2019 · Updated 7 years ago
pavelbogomolenko / keycloak-custom-password-hash
Example of custom password hash SPI for Keycloak
☆28 · Mar 6, 2017 · Updated 9 years ago
awslabs / deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
☆3,636 · Updated this week
msfidelis / serverless-pipeline
Pipeline to build, test and deploy Serverless Framework Projects with CodeBuild and CodePipeline on AWS using Terraform.
☆42 · Mar 12, 2019 · Updated 7 years ago
kaggler-tv / codes
☆10 · Apr 8, 2020 · Updated 6 years ago
d2iq-archive / jackson-case-class-module
Deserialization support for Scala case classes, including proper handling of default values.
☆11 · Apr 5, 2016 · Updated 10 years ago
IainHull / resttest
A lightweight Scala DSL for system testing REST web services
☆24 · Jun 19, 2014 · Updated 12 years ago
aws-samples / flink-stream-processing-refarch
Reference architecture for real-time stream processing with Apache Flink on Amazon EMR, Amazon Kinesis, and Amazon Elasticsearch Service.
☆71 · Feb 21, 2024 · Updated 2 years ago
aws-samples / emr-bootstrap-actions
This repository holds the Amazon Elastic MapReduce sample bootstrap actions
☆613 · Jun 5, 2023 · Updated 3 years ago
Vilos92 / polynote
Unofficial Docker Image for Polynote https://polynote.org/
☆31 · Mar 12, 2022 · Updated 4 years ago
lensesio / kafka-connect-common
Common components used across the datamountaineer kafka connect connectors
☆21 · Feb 12, 2021 · Updated 5 years ago
grahamar / sbt-dynamodb
DynamoDB Local SBT plugin - NO LONGER MAINTAINED, SEE:
☆14 · Sep 28, 2015 · Updated 10 years ago
chilang / intellij-zeppelin
Edit code in IntelliJ, eval/run in Zeppelin notebook
☆18 · Mar 17, 2019 · Updated 7 years ago
AbsaOSS / spline
Data Lineage Tracking And Visualization Solution
☆663 · Updated this week
dharmeshkakadia / awesome-hive
Everything about Apache Hive that is awesome
☆13 · Dec 16, 2020 · Updated 5 years ago