aws-solutions / aws-data-lake-solution
A deployable reference implementation that addresses pain points around conceptualizing data lake architectures. It automatically configures the core AWS services needed to tag, search, share, and govern specific subsets of data across a business or with external businesses.
☆401 Updated 5 months ago
Related projects
Alternatives and complementary repositories for aws-data-lake-solution
- A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda. ☆330 Updated 7 months ago
- The Amazon Athena Query Federation SDK allows you to customize Amazon Athena with your own data sources and code. ☆557 Updated this week
- The open source version of the AWS Glue docs. You can submit feedback & requests for changes by submitting issues in this repo or by maki… ☆199 Updated last year
- A workshop demonstrating the capabilities of S3, Athena, Glue, Kinesis, and Quicksight. ☆159 Updated 4 years ago
- Enterprise-grade, production-hardened, serverless data lake on AWS ☆419 Updated this week
- A UI that simplifies testing with Amazon Kinesis Streams and Firehose. Create and save record templates, and easily send data to Amazon K… ☆200 Updated 3 weeks ago
- Reference Architectures for Datalakes on AWS ☆79 Updated 4 years ago
- Amazon Redshift Advanced Monitoring ☆268 Updated last year
- ☆156 Updated 8 months ago
- Step Functions Data Science SDK for building machine learning (ML) workflows and pipelines on AWS ☆289 Updated last year
- A packaged Data Lake solution, that builds a highly functional Data Lake, with a data catalog queryable via Elasticsearch ☆73 Updated 3 years ago
- Data Lake as Code, featuring ChEMBL and OpenTargets ☆166 Updated 11 months ago
- AWS Glue Libraries are additions and enhancements to Spark for ETL operations. ☆642 Updated 6 months ago
- CloudFormation templates and scripts to set up the AWS services for the workshop, Athena & Redshift Spectrum queries ☆175 Updated 4 years ago
- Creates a CloudFormation template that uses AWS StepFunctions to automate the building and training of Sagemaker custom models based on S… ☆165 Updated 4 years ago
- Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the… ☆238 Updated last month
- An open source development framework to help you build data workflows and modern data architecture on AWS. ☆256 Updated this week
- AWS libraries/modules for working with Kinesis aggregated record data ☆378 Updated last month
- Glue scripts for converting AWS Service Logs for use in Athena ☆142 Updated 9 months ago
- A sample AWS Lambda function that accepts messages from an Amazon Kinesis Stream and transfers the messages to another data transport. ☆288 Updated 2 years ago
- Turbine: the bare metals that gets you Airflow ☆377 Updated 3 years ago
- Sample CloudFormation templates and architecture for AWS Service Catalog ☆429 Updated 6 months ago
- ☆66 Updated 5 months ago
- A Data Platform built for AWS, powered by Kubernetes. ☆127 Updated last year
- ☆74 Updated 11 months ago
- Sample Apache Flink application that can be deployed to Kinesis Analytics for Java. It reads taxi events from a Kinesis data stream, proc… ☆85 Updated last year
- Samples and documentation for using the Amazon Neptune graph database service ☆354 Updated this week
- AWS Lambda function to forward Stream data to Kinesis Firehose ☆279 Updated last year
- Lab Instructions for Data Engineering Immersion Day ☆177 Updated last week