A best practices guide for using AWS EMR. The guide covers cost, performance, security, operational excellence, and reliability, as well as application-specific best practices across Spark, Hive, Hudi, HBase, and more.
☆110Apr 5, 2026Updated last week
Alternatives and similar repositories for aws-emr-best-practices
Users who are interested in aws-emr-best-practices are comparing it to the libraries listed below.
- ☆23Feb 14, 2025Updated last year
- ☆45Apr 4, 2026Updated 2 weeks ago
- Best practices and recommendations for getting started with Amazon EMR on EKS.☆68Jan 27, 2026Updated 2 months ago
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR☆39Feb 17, 2025Updated last year
- An Apache Spark Structured Streaming S3 connector for reading S3 files using Amazon S3 event notifications to AWS SQS☆15Feb 13, 2024Updated 2 years ago
- Application to securely map users on a multi tenant Amazon EMR cluster to different IAM Roles and then assume the mapped Role.☆24Oct 24, 2023Updated 2 years ago
- This repository contains the dbt-glue adapter☆143Mar 31, 2026Updated 2 weeks ago
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.☆53Oct 31, 2023Updated 2 years ago
- Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog☆35Dec 5, 2023Updated 2 years ago
- ☆18Nov 4, 2024Updated last year
- ☆14Feb 26, 2024Updated 2 years ago
- ☆157Feb 29, 2024Updated 2 years ago
- ☆32Jan 30, 2026Updated 2 months ago
- Amazon EMR on EKS Custom Image CLI☆32Sep 26, 2024Updated last year
- Simple secret module for AWS Secrets Manager☆10Aug 16, 2022Updated 3 years ago
- Analyzing NBA Data☆11Feb 19, 2015Updated 11 years ago
- ☆56Updated this week
- The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r…☆62Jun 15, 2023Updated 2 years ago
- An Apache Hive SerDe (short for serializer/deserializer) for the Ion file format.☆31Mar 27, 2025Updated last year
- ☆22Oct 18, 2023Updated 2 years ago
- Spark connector for Milvus☆16Updated this week
- ☆17Oct 15, 2020Updated 5 years ago
- This tool makes it easy to build an EMR cluster edge node (also called a client node or gateway node)☆10Jun 1, 2022Updated 3 years ago
- Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3☆26Sep 10, 2024Updated last year
- Common modules to help set up the Snowflake CI/CD using flyway and Azure DevOps pipeline.☆20May 3, 2022Updated 3 years ago
- The dbt-spark-livy adapter allows you to use dbt along with Apache Spark, by connecting via Apache Livy☆12Mar 30, 2023Updated 3 years ago
- This solution combines Amazon Pinpoint with Amazon SageMaker to help automate the process of collecting customer data, predicting custom…☆17Dec 17, 2020Updated 5 years ago
- ☆23Sep 3, 2024Updated last year
- Scripts and instructions to facilitate running Deep Learning Tasks on Amazon EMR☆63Nov 9, 2023Updated 2 years ago
- Public README☆13Aug 2, 2025Updated 8 months ago
- ☆17Dec 31, 2025Updated 3 months ago
- Example code for running Spark and Hive jobs on EMR Serverless.☆169Mar 11, 2026Updated last month
- Profiling Spark Applications for Performance Comparison and Diagnosis☆17Nov 11, 2018Updated 7 years ago
- Helper tool for migrating from Vaadin Framework 7 to 8☆10Aug 4, 2022Updated 3 years ago
- ☆75Jun 8, 2023Updated 2 years ago
- Apache Spark build compatible with AWS Glue Data Catalog.☆19Aug 9, 2021Updated 4 years ago
- ☆10Apr 5, 2024Updated 2 years ago
- This GenAI solution enables users to extract insights from diverse data formats (video, audio, PDFs, text) through a unified interface. U…☆17Feb 12, 2026Updated 2 months ago
- Performant Redshift data source for Apache Spark☆140Mar 17, 2026Updated last month