cfregly / spark-dynamodb

[WIP] Spark-DynamoDB Data Sources API Implementation

☆9

Alternatives and similar repositories for spark-dynamodb

Users who are interested in spark-dynamodb are comparing it to the libraries listed below.

VeritoneAlpha / jaws-spark-sql-rest
☆92 Updated 8 years ago
tresata / spark-kafka
Low level integration of Spark and Kafka
☆130 Updated 7 years ago
mesos / spark-ec2
[NOTE: Repository has moved to github.com/amplab/spark-ec2]
☆57 Updated 9 years ago
intentmedia / mario
Functional, Typesafe, Declarative Data Pipelines
☆139 Updated 7 years ago
kawaa / Beetest
A super simple utility for testing Apache Hive scripts locally for non-Java developers.
☆72 Updated 8 years ago
spotify / hdfs2cass
Hadoop MapReduce job to bulk load data into Cassandra
☆75 Updated 3 years ago
collectivemedia / modelmatrix
Sparse feature extraction with Spark
☆30 Updated 6 years ago
bythebay / pipeline
Complete Pipeline Training at Big Data Scala By the Bay
☆71 Updated 9 years ago
holdenk / elasticsearchspark
Elastic Search on Spark
☆112 Updated 10 years ago
cfregly / spark-after-dark
☆24 Updated 9 years ago
darkjh / scalaflow
Fluent Scala DSL for Google's Cloud Dataflow SDK
☆56 Updated 9 years ago
rdblue / s3committer
Hadoop output committers for S3
☆109 Updated 4 years ago
snowplow-archive / spark-example-project
A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
☆118 Updated 9 years ago
tresata / spark-scalding
Use Cascading Taps and Scalding DSL with Spark
☆49 Updated 8 years ago
maropu / hivemall-spark
A Hivemall wrapper for Spark
☆31 Updated 9 years ago
ogrodnek / spark-plug
Scala driver for launching Amazon EMR jobs
☆39 Updated 9 years ago
TrueCar / mleap
MLeap allows for easily putting Spark ML pipelines into production
☆78 Updated 8 years ago
adobe-research / spark-parquet-thrift-example
Example Spark project using Parquet as a columnar store with Thrift objects.
☆48 Updated 10 years ago
SinghAsDev / pankh
☆76 Updated 9 years ago
edwardcapriolo / hive_test
Unit test framework for Hive and hive-service
☆64 Updated 2 years ago
holdenk / spark-validator
A library you can include in your Spark job to validate the counters and perform operations on success. Goal is scala/java/python support…
☆109 Updated 7 years ago
googlegenomics / spark-examples
Apache Spark jobs such as Principal Coordinate Analysis.
☆75 Updated 8 years ago
spotify / spark-bigquery
Google BigQuery support for Spark, SQL, and DataFrames
☆155 Updated 5 years ago
tresata / spark-sorted
Secondary sort and streaming reduce for Apache Spark
☆78 Updated last year
intenthq / pucket
Bucketing and partitioning system for Parquet
☆30 Updated 6 years ago
hbutani / spark-datetime
functionstest
☆33 Updated 8 years ago
traviscrawford / spark-dynamodb
DynamoDB data source for Apache Spark
☆95 Updated 3 years ago
hohonuuli / sparknotebook
An example of running Apache Spark using Scala in an IPython notebook
☆140 Updated 9 years ago
coral-streaming / coral
Coral is a real-time analytics and data science platform. It transforms streaming events and extracts patterns from data via RESTful APIs.…
☆146 Updated 5 years ago
amazon-archives / aws-scala-sdk
It's like the AWS SDK for Java, but more Scala-y
☆73 Updated 7 years ago