zillow / aws-custom-credential-provider

A custom AWS credential provider that allows your Hadoop or Spark application to access the S3 file system by assuming a role

☆10

Alternatives and similar repositories for aws-custom-credential-provider

Users who are interested in aws-custom-credential-provider are comparing it to the libraries listed below.

ExpediaGroup / circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
☆91 Updated last year
CoxAutomotiveDataSolutions / waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
☆76 Updated last year
KeithSSmith / spark-compaction
File compaction tool that runs on top of the Spark framework.
☆59 Updated 6 years ago
rdblue / s3committer
Hadoop output committers for S3
☆111 Updated 5 years ago
aws-samples / flink-stream-processing-refarch
Reference architecture for real-time stream processing with Apache Flink on Amazon EMR, Amazon Kinesis, and Amazon Elasticsearch Service.
☆70 Updated last year
SponsorPay / jaquet
Spark stream from Kafka (JSON) to S3 (Parquet)
☆15 Updated 7 years ago
zalando-incubator / spark-json-schema
JSON schema parser for Apache Spark
☆82 Updated 3 years ago
swoop-inc / spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
☆73 Updated 4 years ago
hortonworks-spark / spark-schema-registry
Schema Registry integration for Apache Spark
☆40 Updated 3 years ago
audienceproject / spark-dynamodb
Plug-and-play implementation of an Apache Spark custom data source for AWS DynamoDB.
☆176 Updated 4 years ago
hammerlab / yarn-logs-helpers
Scripts for parsing / making sense of yarn logs
☆52 Updated 9 years ago
hortonworks-spark / cloud-integration
Spark cloud integration: tests, cloud committers and more
☆20 Updated 10 months ago
qubole / kinesis-sql
Kinesis Connector for Structured Streaming
☆137 Updated last year
snowplow-archive / spark-example-project
A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
☆119 Updated 9 years ago
FelixNeutatz / parquet-flinktacular
How to use Parquet in Flink
☆32 Updated 8 years ago
qubole / spark-acid
ACID Data Source for Apache Spark based on Hive ACID
☆97 Updated 4 years ago
awslabs / aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a…
☆226 Updated 8 months ago
avensolutions / cdc-at-scale-using-spark
Scalable CDC Pattern Implemented using PySpark
☆18 Updated last month
awslabs / aws-glue-catalog-sync-agent-for-hive
Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog
☆35 Updated 2 years ago
ExpediaGroup / shunting-yard
Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
☆20 Updated 4 years ago
bartosz25 / spark-scala-playground
Sample processing code using Spark 2.1+ and Scala
☆51 Updated 5 years ago
pluralsight / hydra-spark
☆50 Updated 5 years ago
nerdammer / spark-additions
Utilities for Apache Spark
☆34 Updated 9 years ago
miguno / avro-cli-examples
Examples on how to use the command line tools in Avro Tools to read and write Avro files
☆153 Updated last year
zheyuan28 / SparkTaskMetrics
Task Metrics Explorer
☆14 Updated 6 years ago
traviscrawford / spark-dynamodb
DynamoDB data source for Apache Spark
☆95 Updated 4 years ago
awslabs / kinesis-kafka-connector
kinesis-kafka-connector is a connector based on Kafka Connect to publish messages to Amazon Kinesis streams or Amazon Kinesis Firehose.
☆158 Updated 2 years ago
Talend / beam-samples
☆81 Updated 2 years ago
japila-books / delta-lake-internals
The Internals of Delta Lake
☆187 Updated this week
HeartSaVioR / spark-state-tools
Spark Structured Streaming State Tools
☆34 Updated 5 years ago