A best practices guide for using AWS EMR. The guide covers cost, performance, security, operational excellence, reliability, and application-specific best practices across Spark, Hive, Hudi, HBase and more.
☆110Apr 5, 2026Updated last month
Alternatives and similar repositories for aws-emr-best-practices
Users that are interested in aws-emr-best-practices are comparing it to the libraries listed below.
- ☆45Apr 30, 2026Updated last week
- Best practices and recommendations for getting started with Amazon EMR on EKS.☆69May 3, 2026Updated last week
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR☆39Feb 17, 2025Updated last year
- Spark Structured Streaming Kinesis Data Streams connector supports both GetRecords and SubscribeToShard (Enhanced Fan-Out, EFO)☆39Updated this week
- An Apache Spark Structured Streaming S3 connector for reading S3 files using Amazon S3 event notifications to AWS SQS☆15Feb 13, 2024Updated 2 years ago
- Application to securely map users on a multi tenant Amazon EMR cluster to different IAM Roles and then assume the mapped Role.☆24Oct 24, 2023Updated 2 years ago
- This repository contains the dbt-glue adapter☆143May 1, 2026Updated last week
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.☆53Oct 31, 2023Updated 2 years ago
- Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog☆35Dec 5, 2023Updated 2 years ago
- ☆18Nov 4, 2024Updated last year
- ☆14Feb 26, 2024Updated 2 years ago
- ☆157Feb 29, 2024Updated 2 years ago
- Amazon EMR on EKS Custom Image CLI☆32Sep 26, 2024Updated last year
- Simple secret module for AWS Secrets Manager☆10Aug 16, 2022Updated 3 years ago
- Analyzing NBA Data☆11Feb 19, 2015Updated 11 years ago
- Deploy Jupyter Notebook to AWS Lambda☆16Nov 18, 2020Updated 5 years ago
- ☆55Apr 21, 2026Updated 2 weeks ago
- The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r…☆62Jun 15, 2023Updated 2 years ago
- ☆22Oct 18, 2023Updated 2 years ago
- ☆21Updated this week
- This tool can easily build an EMR cluster edge node / client node / gateway node☆10Jun 1, 2022Updated 3 years ago
- Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3☆26Sep 10, 2024Updated last year
- The dbt-spark-livy adapter allows you to use dbt along with Apache Spark, by connecting via Apache Livy☆12Mar 30, 2023Updated 3 years ago
- ☆27Aug 8, 2024Updated last year
- A simple crypto algotrader for RobinHood☆16Jan 30, 2021Updated 5 years ago
- This solution combines Amazon Pinpoint with Amazon SageMaker to help automate the process of collecting customer data, predicting custom…☆17Dec 17, 2020Updated 5 years ago
- ☆23Sep 3, 2024Updated last year
- Public README☆13Aug 2, 2025Updated 9 months ago
- ☆12May 18, 2019Updated 6 years ago
- Example code for running Spark and Hive jobs on EMR Serverless.☆169Updated this week
- ☆25Jul 4, 2023Updated 2 years ago
- ☆75Jun 8, 2023Updated 2 years ago
- Apache Spark build compatible with AWS Glue Data Catalog.☆19Aug 9, 2021Updated 4 years ago
- Keeping your infrastructure clean since 2018☆12Mar 14, 2024Updated 2 years ago
- ☆10Apr 5, 2024Updated 2 years ago
- Performant Redshift data source for Apache Spark☆140Mar 17, 2026Updated last month
- aws-solutions-library-samples / guidance-for-natural-language-queries-of-relational-databases-on-aws: Demonstration of Natural Language Query (NLQ) of an Amazon RDS for PostgreSQL database, using SageMaker JumpStart, Amazon Bedrock, LangCh…☆72Oct 19, 2024Updated last year
- ☆10Dec 13, 2023Updated 2 years ago
- This package contains the grammar in ANTLR g4 format and Java parser for the Data Quality Definition Language (DQDL), used by AWS Glue Da…☆22May 1, 2026Updated last week