gwenshap / lambda_s3_kafka
AWS Lambda function to get events in a Kafka topic when files are uploaded to S3
☆24 Updated 7 years ago
Alternatives and similar repositories for lambda_s3_kafka
Users that are interested in lambda_s3_kafka are comparing it to the libraries listed below
- Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker… ☆84 Updated 2 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 Updated 9 months ago
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work ☆47 Updated 3 years ago
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 Updated 2 years ago
- ☆61 Updated last year
- Example applications in Java, Python and SQL for Kinesis Data Analytics, demonstrating sources, sinks, and operators. ☆146 Updated last year
- Airflow Unit Tests and Integration Tests ☆261 Updated 2 years ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆67 Updated 3 years ago
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆77 Updated 7 years ago
- Export Redshift data and convert to Parquet for use with Redshift Spectrum or other data warehouses. ☆117 Updated 2 years ago
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics ☆65 Updated 2 years ago
- ☆202 Updated 2 years ago
- Benchmark data warehouses under Fivetran-like conditions ☆171 Updated 2 years ago
- Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks ☆161 Updated last year
- A hybrid Big Data pipeline architecture that combines a real-time streaming layer with a batch layer to process large datasets (Lambda Arc… ☆184 Updated last month
- Automated data quality suggestions and analysis with Deequ on AWS Glue ☆88 Updated 2 years ago
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR ☆175 Updated 5 months ago
- A pyspark lib to validate data quality ☆18 Updated 2 years ago
- Performant Redshift data source for Apache Spark ☆140 Updated 2 weeks ago
- This repository contains a sample project that can be used to start off your own source connector for Kafka Connect. ☆37 Updated 2 years ago
- Use Airflow to move data from multiple MySQL databases to BigQuery ☆100 Updated 5 years ago
- Supporting repository for the blog post at https://medium.com/@stephane.maarek/how-to-use-apache-kafka-to-transform-a-batch-pipeline-into… ☆245 Updated last year
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆64 Updated 3 years ago
- Example DAGs using hooks and operators from Airflow Plugins ☆347 Updated 7 years ago
- A Getting Started Guide for developing and using Airflow Plugins ☆93 Updated 6 years ago
- 📚 Tech blogs & talks by companies that run Apache Flink in production ☆181 Updated 2 months ago
- Airflow training for the crunch conf ☆104 Updated 7 years ago
- Spark runtime on AWS Lambda ☆111 Updated 2 months ago
- A guide to running Airflow on Kubernetes ☆173 Updated 6 years ago
- A curated list of all the awesome examples, articles, tutorials and videos for Apache Airflow. ☆96 Updated 4 years ago