AWS Glue Libraries are additions and enhancements to Spark for ETL operations.
☆699 · Jan 13, 2026 · Updated 3 months ago
Alternatives and similar repositories for aws-glue-libs
Users that are interested in aws-glue-libs are comparing it to the libraries listed below.
- AWS Glue code samples ☆1,530 · Nov 5, 2025 · Updated 5 months ago
- The open source version of the AWS Glue docs. You can submit feedback & requests for changes by submitting issues in this repo or by maki… ☆201 · Jun 15, 2023 · Updated 2 years ago
- ☆14 · Feb 26, 2024 · Updated 2 years ago
- pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoD… ☆4,105 · Apr 14, 2026 · Updated last week
- The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a… ☆228 · Mar 19, 2025 · Updated last year
- A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda. ☆344 · Mar 29, 2024 · Updated 2 years ago
- Amazon Redshift Utils contains utilities, scripts and view which are useful in a Redshift environment ☆2,810 · Sep 3, 2025 · Updated 7 months ago
- Glue scripts for converting AWS Service Logs for use in Athena ☆140 · Feb 1, 2024 · Updated 2 years ago
- ☆72 · Jun 3, 2024 · Updated last year
- The Amazon Athena Query Federation SDK allows you to customize Amazon Athena with your own data sources and code. ☆609 · Updated this week
- Amazon Redshift Advanced Monitoring ☆270 · Oct 28, 2025 · Updated 5 months ago
- Enterprise-grade, production-hardened, serverless data lake on AWS ☆479 · Oct 1, 2025 · Updated 6 months ago
- A deployable reference implementation intended to address pain points around conceptualizing data lake architectures that automatically c… ☆401 · Jun 3, 2024 · Updated last year
- Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects ☆49 · Dec 3, 2024 · Updated last year
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,609 · Apr 1, 2026 · Updated 2 weeks ago
- Redshift Python Connector. It supports Python Database API Specification v2.0. ☆218 · Mar 31, 2026 · Updated 3 weeks ago
- ☆12 · Mar 31, 2021 · Updated 5 years ago
- ☆157 · Feb 29, 2024 · Updated 2 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆687 · Mar 6, 2025 · Updated last year
- A collection of example UDFs for Amazon Redshift. ☆244 · Oct 25, 2024 · Updated last year
- Glue VSCode devcontainer setup ☆14 · Jan 31, 2023 · Updated 3 years ago
- 🐋 Docker image for AWS Glue Spark/Python ☆23 · Sep 5, 2023 · Updated 2 years ago
- Implementations of open source Apache Hadoop/Hive interfaces which allow for ingesting data from Amazon DynamoDB ☆228 · Apr 8, 2026 · Updated last week
- Repository for code examples from my youtube channel and medium articles working with data in python on AWS ☆29 · Feb 5, 2024 · Updated 2 years ago
- Amazon Redshift Database Loader implemented in AWS Lambda ☆595 · Jul 16, 2024 · Updated last year
- This repository has moved into https://github.com/dbt-labs/dbt-adapters ☆446 · Jul 16, 2025 · Updated 9 months ago
- AthenaCLI is a CLI tool for AWS Athena service that can do auto-completion and syntax highlighting. ☆225 · Dec 12, 2025 · Updated 4 months ago
- A CLI to manage and monitor permissions in AWS Lake Formation ☆25 · Feb 8, 2023 · Updated 3 years ago
- An open source development framework to help you build data workflows and modern data architecture on AWS. ☆271 · Feb 9, 2026 · Updated 2 months ago
- Build, Test and Deploy ETL solutions using AWS Glue and AWS CDK based CI/CD pipelines ☆45 · Oct 20, 2022 · Updated 3 years ago
- PyAthena is a Python DB API 2.0 (PEP 249) client for Amazon Athena. ☆493 · Apr 14, 2026 · Updated last week
- Amazon Redshift SQLAlchemy Dialect ☆226 · Mar 27, 2026 · Updated 3 weeks ago
- This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) env… ☆804 · Oct 3, 2025 · Updated 6 months ago
- Python API for Deequ ☆815 · Mar 9, 2026 · Updated last month
- Redshift JDBC Driver. It supports JDBC 4.2 specification. ☆70 · Mar 5, 2026 · Updated last month
- A Spark library for Amazon SageMaker. ☆301 · Mar 8, 2025 · Updated last year
- A workshop demonstrating the capabilities of S3, Athena, Glue, Kinesis, and Quicksight. ☆158 · Mar 24, 2020 · Updated 6 years ago
- Spark runtime on AWS Lambda ☆113 · Aug 28, 2025 · Updated 7 months ago
- A solution describing data-processing design pattern for streaming data through Kinesis and Spark Streaming at real-time. ☆39 · Jun 11, 2024 · Updated last year