aws-solutions / aws-data-lake-solution
A deployable reference implementation that addresses pain points around conceptualizing data lake architectures. It automatically configures the core AWS services needed to tag, search, share, and govern specific subsets of data across a business or with external businesses.
☆401 Updated 11 months ago
Alternatives and similar repositories for aws-data-lake-solution:
Users who are interested in aws-data-lake-solution are comparing it to the libraries listed below.
- A serverless architecture for orchestrating ETL jobs in arbitrarily complex workflows using AWS Step Functions and AWS Lambda. ☆340 Updated last year
- A workshop demonstrating the capabilities of S3, Athena, Glue, Kinesis, and QuickSight. ☆157 Updated 5 years ago
- Enterprise-grade, production-hardened, serverless data lake on AWS ☆449 Updated last month
- The open source version of the AWS Glue docs. You can submit feedback & requests for changes by submitting issues in this repo or by maki… ☆198 Updated last year
- A UI that simplifies testing with Amazon Kinesis Streams and Firehose. Create and save record templates, and easily send data to Amazon K… ☆206 Updated 6 months ago
- Data Lake as Code, featuring ChEMBL and OpenTargets ☆170 Updated last year
- A packaged Data Lake solution that builds a highly functional Data Lake, with a data catalog queryable via Elasticsearch ☆73 Updated 4 years ago
- CloudFormation templates and scripts to set up the AWS services for the workshop, Athena & Redshift Spectrum queries ☆175 Updated 5 years ago
- Amazon Redshift Advanced Monitoring ☆272 Updated 2 years ago
- An open source development framework to help you build data workflows and modern data architecture on AWS. ☆263 Updated 2 weeks ago
- ☆158 Updated last year
- The Amazon Athena Query Federation SDK allows you to customize Amazon Athena with your own data sources and code. ☆576 Updated this week
- ☆74 Updated last year
- AWS Glue Libraries are additions and enhancements to Spark for ETL operations. ☆671 Updated last year
- This GitHub project provides a series of lab exercises that help users get started using the Redshift platform. ☆53 Updated 4 years ago
- Lab Instructions for Data Engineering Immersion Day ☆189 Updated 2 months ago
- Glue scripts for converting AWS Service Logs for use in Athena ☆141 Updated last year
- A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficie… ☆242 Updated this week
- Creates a CloudFormation template that uses AWS Step Functions to automate the building and training of SageMaker custom models based on S… ☆165 Updated 5 years ago
- Continuously monitors a set of log files and sends new data to the Amazon Kinesis Stream and Amazon Kinesis Firehose in near real time. ☆369 Updated 2 months ago
- Step Functions Data Science SDK for building machine learning (ML) workflows and pipelines on AWS ☆290 Updated 3 weeks ago
- Reference Architectures for Data Lakes on AWS ☆79 Updated 4 years ago
- Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the… ☆243 Updated last week
- Sample Apache Flink application that can be deployed to Kinesis Analytics for Java. It reads taxi events from a Kinesis data stream, proc… ☆85 Updated last year
- ☆50 Updated 2 years ago
- Sample code for dynamically managing RDS/RDBMS connections across a fleet of Lambda functions ☆236 Updated 6 years ago
- A solution that automatically configures the AWS services necessary to easily capture, store, process, and deliver streaming data. This … ☆93 Updated 3 months ago
- ☆27 Updated 4 years ago
- Design pattern for orchestrating an incremental data ingestion pipeline using AWS Step Functions from an on-premises location into an Amaz… ☆28 Updated 5 years ago
- Workshop and lab content for Amazon Aurora MySQL compatible databases. This code will contain a series of templates, instructional guides… ☆79 Updated last year