This repository holds the Amazon Elastic MapReduce sample bootstrap actions
☆613 · Jun 5, 2023 · Updated 2 years ago
Alternatives and similar repositories for emr-bootstrap-actions
Users that are interested in emr-bootstrap-actions are comparing it to the libraries listed below
- Amazon Elastic MapReduce code samples ☆63 · Sep 8, 2015 · Updated 10 years ago
- ☆895 · Jul 15, 2022 · Updated 3 years ago
- A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR ☆120 · Mar 28, 2016 · Updated 9 years ago
- Amazon Redshift Database Loader implemented in AWS Lambda ☆596 · Jul 16, 2024 · Updated last year
- Amazon Redshift Utils contains utilities, scripts and view which are useful in a Redshift environment ☆2,811 · Sep 3, 2025 · Updated 6 months ago
- This repository hosts sample pipelines ☆470 · May 8, 2020 · Updated 5 years ago
- functionstest ☆33 · Oct 25, 2016 · Updated 9 years ago
- Apache Spark on AWS Lambda ☆157 · Dec 5, 2022 · Updated 3 years ago
- Amazon Redshift Advanced Monitoring ☆272 · Oct 28, 2025 · Updated 4 months ago
- REST job server for Apache Spark ☆2,842 · Jul 8, 2025 · Updated 7 months ago
- The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r… ☆62 · Jun 15, 2023 · Updated 2 years ago
- ☆328 · Mar 18, 2021 · Updated 4 years ago
- A toolset to streamline running spark python on EMR ☆20 · Nov 16, 2016 · Updated 9 years ago
- A collection of example UDFs for Amazon Redshift. ☆244 · Oct 25, 2024 · Updated last year
- Implementations of open source Apache Hadoop/Hive interfaces which allow for ingesting data from Amazon DynamoDB ☆228 · Jan 15, 2026 · Updated last month
- Simplifying robust end-to-end machine learning on Apache Spark. ☆475 · Apr 18, 2017 · Updated 8 years ago
- A Spark Streaming job reading events from Amazon Kinesis and writing event counts to DynamoDB ☆93 · Oct 1, 2020 · Updated 5 years ago
- Base classes to use when writing tests with Spark ☆1,549 · Dec 22, 2025 · Updated 2 months ago
- Redshift Python library for user agent detection (browsers, devices, etc) and parsing via UDFs ☆10 · May 27, 2020 · Updated 5 years ago
- ☆54 · Oct 3, 2023 · Updated 2 years ago
- pysh-db - The Data Science Toolkit (DSK) ☆13 · Dec 5, 2018 · Updated 7 years ago
- Jupyter magics and kernels for working with remote Spark clusters ☆1,362 · Sep 9, 2025 · Updated 5 months ago
- Amazon Kinesis Client Library for Python ☆376 · Dec 10, 2025 · Updated 2 months ago
- Mirror of Apache Toree (Incubating) ☆749 · Feb 21, 2026 · Updated last week
- Interactive and Reactive Data Science using Scala and Spark. ☆3,150 · May 16, 2023 · Updated 2 years ago
- AWS bootstrap scripts for Mozilla's flavoured Spark setup. ☆47 · Feb 13, 2020 · Updated 6 years ago
- Livy is an open source REST interface for interacting with Apache Spark from anywhere ☆1,007 · Oct 5, 2022 · Updated 3 years ago
- Deploy Spark cluster in an easy way. ☆75 · Sep 13, 2016 · Updated 9 years ago
- CLI tool to launch Spark jobs on AWS EMR ☆67 · Oct 18, 2023 · Updated 2 years ago
- ☆762 · Mar 11, 2021 · Updated 4 years ago
- AWS Glue code samples ☆1,536 · Nov 5, 2025 · Updated 3 months ago
- A library you can include in your Spark job to validate the counters and perform operations on success. Goal is scala/java/python support… ☆108 · Feb 1, 2018 · Updated 8 years ago
- Locality Sensitive Hashing for Apache Spark ☆197 · Nov 1, 2016 · Updated 9 years ago
- A command-line tool for launching Apache Spark clusters. ☆651 · Dec 13, 2024 · Updated last year
- DynamoDB data source for Apache Spark ☆95 · Sep 2, 2021 · Updated 4 years ago
- ☆110 · Apr 17, 2017 · Updated 8 years ago
- Client library for Amazon Kinesis ☆660 · Feb 23, 2026 · Updated last week
- Hadoop output committers for S3 ☆113 · Jul 9, 2020 · Updated 5 years ago
- ☆19 · Jul 11, 2023 · Updated 2 years ago