gwenshap / lambda_s3_kafka
AWS Lambda function to get events in Kafka topic when files are uploaded to S3
☆24 Updated 7 years ago
Alternatives and similar repositories for lambda_s3_kafka
Users who are interested in lambda_s3_kafka compare it to the repositories listed below
- Performant Redshift data source for Apache Spark ☆142 Updated 2 months ago
- Airflow Unit Tests and Integration Tests ☆260 Updated 2 years ago
- A best practices guide for using AWS EMR. The guide will cover best practices on the topics of cost, performance, security, operational e… ☆108 Updated 2 months ago
- Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks ☆161 Updated 10 months ago
- Airflow training for the crunch conf ☆105 Updated 6 years ago
- ☆201 Updated last year
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆77 Updated 6 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 Updated 7 months ago
- Fully reproducible, Dockerized, step-by-step, demo on how to stream tables from Postgres to Kafka/KSQL back to Postgres. Detailed blog p… ☆152 Updated 3 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆64 Updated 3 years ago
- ☆247 Updated 5 years ago
- Benchmark data warehouses under Fivetran-like conditions ☆170 Updated 2 years ago
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 Updated 2 years ago
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work ☆47 Updated 3 years ago
- Use Airflow to move data from multiple MySQL databases to BigQuery ☆100 Updated 5 years ago
- ☆59 Updated last year
- Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker… ☆84 Updated 2 years ago
- Spark ETL example processing New York taxi rides public dataset on EKS ☆44 Updated 2 years ago
- A guide to running Airflow on Kubernetes ☆173 Updated 6 years ago
- Example DAGs using hooks and operators from Airflow Plugins ☆347 Updated 7 years ago
- A Getting Started Guide for developing and using Airflow Plugins ☆93 Updated 6 years ago
- Snowflake Data Source for Apache Spark. ☆229 Updated last week
- Flowchart for debugging Spark applications ☆107 Updated 11 months ago
- Data ingestion library for Amundsen to build graph and search index ☆204 Updated last year
- 📚 Tech blogs & talks by companies that run Apache Flink in production ☆173 Updated last week
- The open source version of the AWS Glue docs. You can submit feedback & requests for changes by submitting issues in this repo or by maki… ☆199 Updated 2 years ago
- A full big data pipeline (Lambda Architecture) with Spark, Kafka, HDFS and Cassandra. ☆180 Updated 2 months ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆112 Updated last month
- Magic to help Spark pipelines upgrade ☆34 Updated 11 months ago
- Multiple node presto cluster on docker container ☆125 Updated 3 years ago