aws / aws-emr-best-practices
A best practices guide for using AWS EMR. The guide will cover best practices on the topics of cost, performance, security, operational excellence, and reliability, as well as application-specific best practices across Spark, Hive, Hudi, HBase and more.
☆110 · Feb 2, 2026 · Updated last week
Alternatives and similar repositories for aws-emr-best-practices
Users that are interested in aws-emr-best-practices are comparing it to the libraries listed below
- Best practices and recommendations for getting started with Amazon EMR on EKS. ☆67 · Jan 27, 2026 · Updated 2 weeks ago
- ☆42 · Jan 9, 2026 · Updated last month
- ☆23 · Feb 14, 2025 · Updated last year
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR ☆39 · Feb 17, 2025 · Updated 11 months ago
- This repository contains the dbt-glue adapter ☆141 · Jan 6, 2026 · Updated last month
- Source code for the post, 'Getting Started with Data Analysis on AWS, using S3, Glue, Amazon Athena, and QuickSight' ☆29 · Dec 22, 2020 · Updated 5 years ago
- ☆24 · Oct 3, 2023 · Updated 2 years ago
- ☆13 · Feb 26, 2024 · Updated last year
- Application to securely map users on a multi tenant Amazon EMR cluster to different IAM Roles and then assume the mapped Role. ☆24 · Oct 24, 2023 · Updated 2 years ago
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio. ☆52 · Oct 31, 2023 · Updated 2 years ago
- An Apache Spark Structured Streaming S3 connector for reading S3 files using Amazon S3 event notifications to AWS SQS ☆15 · Feb 13, 2024 · Updated 2 years ago
- This tool can easily make / build an emr cluster edge node / client node / gateway node ☆10 · Jun 1, 2022 · Updated 3 years ago
- ☆18 · Nov 4, 2024 · Updated last year
- ☆32 · Jan 30, 2026 · Updated 2 weeks ago
- The dbt-spark-livy adapter allows you to use dbt along with Apache Spark, by connecting via Apache Livy ☆12 · Mar 30, 2023 · Updated 2 years ago
- Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog ☆35 · Dec 5, 2023 · Updated 2 years ago
- ☆22 · Nov 4, 2025 · Updated 3 months ago
- Spark Structured Streaming Kinesis Data Streams connector supports both GetRecords and SubscribeToShard (Enhanced Fan-Out, EFO) ☆39 · Jan 30, 2026 · Updated 2 weeks ago
- The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r… ☆62 · Jun 15, 2023 · Updated 2 years ago
- Deploy Jupyter Notebook to AWS Lambda ☆16 · Nov 18, 2020 · Updated 5 years ago
- This solution combines Amazon Pinpoint with Amazon SageMaker to help automate the process of collecting customer data, predicting custom… ☆17 · Dec 17, 2020 · Updated 5 years ago
- OpenAI-Compatible RESTful APIs for Amazon Bedrock, modified from the original "bedrock-access-gateway" project for not using ALB, so that… ☆18 · Jan 31, 2026 · Updated 2 weeks ago
- Mirror of Apache Ranger ☆15 · Apr 5, 2024 · Updated last year
- ☆25 · Jul 4, 2023 · Updated 2 years ago
- ☆20 · Mar 6, 2025 · Updated 11 months ago
- ☆23 · Sep 3, 2024 · Updated last year
- Amazon EMR Notebook to show how to read from and write to Delta tables with Amazon EMR ☆17 · Apr 27, 2025 · Updated 9 months ago
- Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3 ☆25 · Sep 10, 2024 · Updated last year
- Transcribe news audio in realtime ☆24 · Sep 4, 2023 · Updated 2 years ago
- ☆22 · Oct 18, 2023 · Updated 2 years ago
- Example code for running Spark and Hive jobs on EMR Serverless. ☆168 · Jan 8, 2025 · Updated last year
- The open source version of the Amazon EMR Release Guide. You can submit feedback & requests for changes by submitting issues in this repo… ☆29 · Jun 15, 2023 · Updated 2 years ago
- Build and automatize the management of your Sagemaker Studio Users using AWS CDK! ☆23 · Apr 5, 2022 · Updated 3 years ago
- A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficie… ☆248 · Feb 7, 2026 · Updated last week
- aws-solutions-library-samples / guidance-for-natural-language-queries-of-relational-databases-on-aws: Demonstration of Natural Language Query (NLQ) of an Amazon RDS for PostgreSQL database, using SageMaker JumpStart, Amazon Bedrock, LangCh… ☆71 · Oct 19, 2024 · Updated last year
- Scripts and instructions to facilitate running Deep Learning Tasks on Amazon EMR ☆62 · Nov 9, 2023 · Updated 2 years ago
- This Guidance demonstrates how you can automate your carbon footprint tracking with the Sustainability Insights Framework (SIF) on AWS ☆29 · Oct 20, 2024 · Updated last year
- ☆27 · Aug 8, 2024 · Updated last year
- Prometheus Exporter for Cloudera Cluster status and usage metrics ☆24 · Mar 30, 2023 · Updated 2 years ago