gwenshap / lambda_s3_kafka
AWS Lambda function to get events in Kafka topic when files are uploaded to S3
☆24 Updated 7 years ago
Alternatives and similar repositories for lambda_s3_kafka
Users who are interested in lambda_s3_kafka are comparing it to the repositories listed below.
- A best practices guide for using AWS EMR. The guide will cover best practices on the topics of cost, performance, security, operational e… ☆109 Updated 3 months ago
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆77 Updated 7 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 Updated last year
- Performant Redshift data source for Apache Spark ☆141 Updated this week
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 Updated 2 years ago
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work ☆47 Updated 3 years ago
- Example applications in Java, Python and SQL for Kinesis Data Analytics, demonstrating sources, sinks, and operators. ☆147 Updated last year
- Automated data quality suggestions and analysis with Deequ on AWS Glue ☆90 Updated 3 years ago
- ☆65 Updated last year
- Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker… ☆84 Updated 3 years ago
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR ☆175 Updated 7 months ago
- Example DAGs using hooks and operators from Airflow Plugins ☆348 Updated 7 years ago
- Experiments and demonstrations of AVRO, Protobuf serialisation ☆61 Updated 3 years ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆67 Updated 4 years ago
- This repository contains a sample project that can be used to start off your own source connector for Kafka Connect. ☆35 Updated 3 years ago
- A pyspark lib to validate data quality ☆18 Updated 3 years ago
- Fully reproducible, Dockerized, step-by-step, demo on how to stream tables from Postgres to Kafka/KSQL back to Postgres. Detailed blog p… ☆152 Updated 4 years ago
- The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r… ☆62 Updated 2 years ago
- Spark runtime on AWS Lambda ☆113 Updated 4 months ago
- Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks ☆162 Updated last year
- Export Redshift data and convert to Parquet for use with Redshift Spectrum or other data warehouses. ☆117 Updated 3 years ago
- Apache Spark on AWS Lambda ☆157 Updated 3 years ago
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics ☆65 Updated 2 years ago
- kinesis-kafka-connector is connector based on Kafka Connect to publish messages to Amazon Kinesis streams or Amazon Kinesis Firehose. ☆158 Updated 2 years ago
- Supporting repository for the blog post at https://medium.com/@stephane.maarek/how-to-use-apache-kafka-to-transform-a-batch-pipeline-into… ☆247 Updated 2 years ago
- Example code for running Spark and Hive jobs on EMR Serverless. ☆168 Updated last year
- This is the documentation for the Amazon Redshift Developer Guide ☆121 Updated 2 years ago
- Benchmark data warehouses under Fivetran-like conditions ☆172 Updated 3 years ago
- PySpark phonetic and string matching algorithms ☆41 Updated last year
- JSON schema parser for Apache Spark ☆82 Updated 3 years ago