BigDataBoutique / presto-cloud-deploy
Deploy Presto on the cloud easily, using Terraform and Packer
☆45Updated 2 years ago
Alternatives and similar repositories for presto-cloud-deploy
Users who are interested in presto-cloud-deploy are comparing it to the libraries listed below.
- Apiary provides modules which can be combined to create a federated cloud data lake☆36Updated last year
- Mirrors a Kinesis stream to Amazon S3 using the KCL☆42Updated 8 months ago
- Spark ETL example processing New York taxi rides public dataset on EKS☆45Updated 2 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection☆18Updated 8 years ago
- Highly configurable Helm Presto Chart☆24Updated 5 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.☆88Updated last year
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark☆13Updated 2 years ago
- The SQS connector plugin provides the ability to use AWS SQS queues as both a source (from an SQS queue into a Kafka topic) or sink (out …☆77Updated 6 months ago
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics☆64Updated last year
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu…☆75Updated 6 years ago
- Automatically loads new partitions in AWS Athena☆19Updated 4 years ago
- Terraform module to provision an Elastic MapReduce (EMR) cluster on AWS☆74Updated 3 months ago
- The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r…☆62Updated last year
- A library for reading data from Amazon S3 with optimised listing using Amazon SQS using Spark SQL Streaming (or Structured Streaming).☆17Updated last year
- Web UI for Amazon Athena☆56Updated 2 years ago
- Open Source Secret Provider plugin for the Kafka Connect framework☆46Updated 10 months ago
- Kinesis Connector for Structured Streaming☆136Updated 10 months ago
- A testing framework for Trino☆26Updated 2 months ago
- Kafka Configuration Provider for AWS Secrets Manager☆23Updated 2 years ago
- Export Airflow metrics (from mysql) in prometheus format☆29Updated last month
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o…☆51Updated last year
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep…☆20Updated 5 years ago
- Spark Scala docker container sample for AWS testing - EKS & S3☆24Updated 6 years ago
- Scala SDK for working with Snowplow enriched events in Spark, AWS Lambda, Flink et al.☆21Updated 6 months ago
- ⚠️ MAINTENANCE-ONLY MODE: Snowplow maintained SQL data models for working with Snowplow web and mobile behavioral data.☆41Updated 4 months ago
- ☆45Updated 7 years ago
- Helm Chart for deploying Spark history server in Amazon EKS for S3 Spark Event Logs☆20Updated this week
- Autoscaling EMR clusters and Kinesis streams on Amazon Web Services (AWS)☆47Updated last year
- Paper: A Zero-rename committer for object stores☆20Updated 4 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor…☆68Updated 3 months ago