aws/aws-emr-best-practices

A best practices guide for using AWS EMR. The guide will cover best practices for cost, performance, security, operational excellence, and reliability, plus application-specific best practices across Spark, Hive, Hudi, HBase, and more.

☆110

Alternatives and similar repositories for aws-emr-best-practices

Users that are interested in aws-emr-best-practices are comparing it to the libraries listed below.

aws-samples / emr-trino-autoscale
☆23 · Feb 14, 2025 · Updated last year
aws / aws-emr-containers-best-practices
Best practices and recommendations for getting started with Amazon EMR on EKS.
☆70 · Jun 29, 2026 · Updated 2 weeks ago
aws-samples / aws-emr-utilities
☆45 · Updated this week
aws-samples / emr-spark-benchmark
☆26 · Apr 26, 2026 · Updated 2 months ago
awslabs / amazon-emr-vscode-toolkit
A VS Code Extension to make it easier to manage and develop Spark jobs on EMR
☆39 · Feb 17, 2025 · Updated last year
awslabs / spark-sql-kinesis-connector
Spark Structured Streaming Kinesis Data Streams connector supports both GetRecords and SubscribeToShard (Enhanced Fan-Out, EFO)
☆41 · Updated this week
aws-samples / spark-streaming-sql-s3-connector
An Apache Spark Structured Streaming S3 connector for reading S3 files using Amazon S3 event notifications to AWS SQS
☆16 · Feb 13, 2024 · Updated 2 years ago
awslabs / amazon-emr-user-role-mapper
Application to securely map users on a multi tenant Amazon EMR cluster to different IAM Roles and then assume the mapped Role.
☆24 · Oct 24, 2023 · Updated 2 years ago
aws-samples / dbt-glue
This repository contains the dbt-glue adapter
☆144 · Jul 2, 2026 · Updated last week
aws-samples / emr-studio-notebook-examples
This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.
☆53 · Oct 31, 2023 · Updated 2 years ago
aws-samples / aws-emr-apache-ranger
☆24 · Oct 3, 2023 · Updated 2 years ago
garystafford / athena-glue-quicksight-demo
Source code for the post, 'Getting Started with Data Analysis on AWS, using S3, Glue, Amazon Athena, and QuickSight'
☆29 · Dec 22, 2020 · Updated 5 years ago
awslabs / aws-glue-catalog-sync-agent-for-hive
Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog
☆37 · Dec 5, 2023 · Updated 2 years ago
awsdocs / amazon-emr-management-guide
The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r…
☆62 · Jun 15, 2023 · Updated 3 years ago
aws-samples / emr-remote-shuffle-service
☆18 · May 7, 2026 · Updated 2 months ago
awslabs / aws-glue-streaming-libs
☆14 · Feb 26, 2024 · Updated 2 years ago
awslabs / amazon-emr-on-eks-custom-image-cli
Amazon EMR on EKS Custom Image CLI
☆32 · Sep 26, 2024 · Updated last year
aws-samples / aws-analytics-reference-architecture
☆157 · Feb 29, 2024 · Updated 2 years ago
aws-samples / emr-on-eks-benchmark
☆32 · Jul 2, 2026 · Updated last week
aws-samples / aws-emr-advisor
EMR Advisor uses Spark Event Logs to generate insights and costs/runtime recommendations using different deployment options for Amazon EM…
☆17 · Jun 5, 2025 · Updated last year
rhythmictech / terraform-aws-secretsmanager-secret
Simple secret module for AWS Secrets Manager
☆10 · Aug 16, 2022 · Updated 3 years ago
aws / Unified-Studio-for-Amazon-Sagemaker
☆56 · Apr 21, 2026 · Updated 2 months ago
radzionc / deploy-notebook
Deploy Jupyter Notebook to AWS Lambda
☆16 · Nov 18, 2020 · Updated 5 years ago
aws-samples / aws-organized
☆22 · Oct 18, 2023 · Updated 2 years ago
garystafford / kinesis-redshift-streaming-demo
☆17 · Oct 15, 2020 · Updated 5 years ago
kaloureyes3 / v4-clients
☆10 · Apr 5, 2024 · Updated 2 years ago
bluishglc / emr-edgenode-maker
This tool can easily build an EMR cluster edge node / client node / gateway node
☆10 · Jun 1, 2022 · Updated 4 years ago
cloudera / dbt-spark-livy
The dbt-spark-livy adapter allows you to use dbt along with Apache Spark, by connecting via Apache Livy
☆12 · Mar 30, 2023 · Updated 3 years ago
aws-samples / amazon-redshift-infrastructure-automation
☆27 · Aug 8, 2024 · Updated last year
aws-samples / aws-glue-streaming-etl-with-apache-iceberg
Streaming ETL job cases in AWS Glue that integrate Iceberg and create an in-place updatable data lake on Amazon S3
☆27 · Sep 10, 2024 · Updated last year
aws-samples / redshift-roles
☆24 · Sep 3, 2024 · Updated last year
aws-samples / assignment-automation-4-aws-sso
This sample repository provides a production-ready example of enhancing AWS SSO for enterprise usage. We provide an automation for ass…
☆16 · Jun 8, 2026 · Updated last month
aws-samples / emr-serverless-samples
Example code for running Spark and Hive jobs on EMR Serverless.
☆170 · Jul 8, 2026 · Updated last week
JerryLead / SparkProfiler
Profiling Spark Applications for Performance Comparison and Diagnosis
☆16 · Nov 11, 2018 · Updated 7 years ago
tinyclues / spark-glue-data-catalog
Apache Spark build compatible with AWS Glue Data Catalog.
☆19 · Aug 9, 2021 · Updated 4 years ago
aws-samples / aws-serverless-realtime-aggregation
Set up a near-real-time, scalable, serverless data aggregation pipeline in the AWS Cloud with Amazon DynamoDB, AWS Lambda, and Amazon Kin…
☆18 · Feb 9, 2022 · Updated 4 years ago
aws / awesome-redshift
☆75 · Jun 8, 2023 · Updated 3 years ago
spark-redshift-community / spark-redshift
Performant Redshift data source for Apache Spark
☆140 · Jun 5, 2026 · Updated last month
aws-solutions-library-samples / guidance-for-advanced-multi-modal-chatbot-including-speech-to-speech-on-aws
This GenAI solution enables users to extract insights from diverse data formats (video, audio, PDFs, text) through a unified interface. U…
☆22 · Feb 12, 2026 · Updated 5 months ago