aws-solutions / aws-data-lake-solution
A deployable reference implementation that addresses pain points around conceptualizing data lake architectures. It automatically configures the core AWS services needed to tag, search, share, and govern specific subsets of data across a business or with external businesses.
☆401 Updated 8 months ago
Alternatives and similar repositories for aws-data-lake-solution:
Users who are interested in aws-data-lake-solution are comparing it to the repositories listed below.
- A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda. ☆335 Updated 10 months ago
- The open source version of the AWS Glue docs. You can submit feedback & requests for changes by submitting issues in this repo or by maki… ☆198 Updated last year
- Enterprise-grade, production-hardened, serverless data lake on AWS ☆431 Updated last week
- CloudFormation templates and scripts to set up the AWS services for the workshop, Athena & Redshift Spectrum queries ☆175 Updated 4 years ago
- A UI that simplifies testing with Amazon Kinesis Streams and Firehose. Create and save record templates, and easily send data to Amazon K… ☆202 Updated 4 months ago
- A workshop demonstrating the capabilities of S3, Athena, Glue, Kinesis, and Quicksight. ☆158 Updated 4 years ago
- Amazon Redshift Advanced Monitoring ☆271 Updated last year
- An open source development framework to help you build data workflows and modern data architecture on AWS. ☆261 Updated 3 weeks ago
- AWS Glue Libraries are additions and enhancements to Spark for ETL operations. ☆663 Updated 9 months ago
- A packaged Data Lake solution that builds a highly functional Data Lake, with a data catalog queryable via Elasticsearch ☆73 Updated 4 years ago
- ☆158 Updated 11 months ago
- Reference Architectures for Datalakes on AWS ☆79 Updated 4 years ago
- ☆52 Updated 7 years ago
- Data Lake as Code, featuring ChEMBL and OpenTargets ☆169 Updated last year
- Glue scripts for converting AWS Service Logs for use in Athena ☆142 Updated last year
- Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the… ☆242 Updated last week
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆76 Updated 6 years ago
- Lab Instructions for Data Engineering Immersion Day ☆183 Updated this week
- Turbine: the bare metals that gets you Airflow ☆377 Updated 3 years ago
- Repository for AWS Glue Workshop ☆31 Updated 2 years ago
- This repository hosts sample pipelines ☆465 Updated 4 years ago
- Creates a CloudFormation template that uses AWS StepFunctions to automate the building and training of Sagemaker custom models based on S… ☆165 Updated 5 years ago
- AWS Lambda function to forward Stream data to Kinesis Firehose ☆279 Updated last year
- ☆67 Updated 8 months ago
- AWS libraries/modules for working with Kinesis aggregated record data ☆377 Updated 4 months ago
- A sample AWS Lambda function that accepts messages from an Amazon Kinesis Stream and transfers the messages to another data transport. ☆288 Updated 2 years ago
- Amazon Redshift Database Loader implemented in AWS Lambda ☆595 Updated 7 months ago
- ☆73 Updated last year
- As customers move from building data lakes and analytics on AWS to building machine learning solutions, one of their biggest challenges i… ☆62 Updated 6 years ago
- A simple JavaScript frontend and SAM template to spin up a serverless backend, federating Cognito User Pools users to QuickSight. ☆79 Updated 3 years ago