aws-samples / analysing-realtime-streaming-data-using-msk-emr
☆14Updated 5 years ago
Alternatives and similar repositories for analysing-realtime-streaming-data-using-msk-emr
Users who are interested in analysing-realtime-streaming-data-using-msk-emr are comparing it to the repositories listed below
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics☆65Updated 2 years ago
- A solution describing a data-processing design pattern for streaming data through Kinesis and Spark Streaming in real time.☆38Updated last year
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR☆67Updated 3 years ago
- Reference architecture for real-time stream processing with Apache Flink on Amazon EMR, Amazon Kinesis, and Amazon Elasticsearch Service.☆72Updated last year
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work☆47Updated 3 years ago
- Sample Apache Flink application that can be deployed to Kinesis Analytics for Java. It reads taxi events from a Kinesis data stream, proc…☆86Updated 2 years ago
- Learn how to build an end-to-end streaming architecture to ingest, analyze, and visualize streaming data in near real-time☆34Updated 3 years ago
- ☆13Updated 2 years ago
- Example applications in Java, Python and SQL for Kinesis Data Analytics, demonstrating sources, sinks, and operators.☆146Updated last year
- Reference Architectures for Datalakes on AWS☆78Updated 5 years ago
- As customers move from building data lakes and analytics on AWS to building machine learning solutions, one of their biggest challenges i…☆63Updated 6 years ago
- The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r…☆62Updated 2 years ago
- This repository shows an example of how to build, manage and orchestrate Machine Learning workflows using Amazon SageMaker and Apache Airf…☆138Updated 4 years ago
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/ ) which creates a concu…☆77Updated 6 years ago
- Sample Apache Beam pipeline that can be deployed to Amazon Managed Service for Apache Flink. It reads taxi events from a Kinesis data str…☆47Updated last year
- A hybrid Big Data pipeline architecture that combines a real-time streaming layer with a batch layer to process large datasets (Lambda Arc…☆184Updated last month
- ☆21Updated 3 months ago
- Example code for running Spark and Hive jobs on EMR Serverless.☆168Updated 9 months ago
- ☆89Updated last year
- Automated data quality suggestions and analysis with Deequ on AWS Glue☆88Updated 2 years ago
- A best practices guide for using AWS EMR. The guide will cover best practices on the topics of cost, performance, security, operational e…☆109Updated 2 weeks ago
- A Java application that replays events that are stored in objects in Amazon S3 into an Amazon Kinesis stream as if they occurred in real t…☆51Updated 9 months ago
- ☆73Updated last year
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.☆52Updated last year
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark☆14Updated 2 years ago
- Spark ETL example processing New York taxi rides public dataset on EKS☆44Updated 2 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform☆47Updated 9 months ago
- Open innovation with 60 minute cloud experiments on AWS☆88Updated last year
- Lab Instructions for Data Engineering Immersion Day☆192Updated 8 months ago
- ☆32Updated last year