awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore

The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an extern…

☆230

Alternatives and similar repositories for aws-glue-data-catalog-client-for-apache-hive-metastore

Users who are interested in aws-glue-data-catalog-client-for-apache-hive-metastore are comparing it to the repositories listed below.

awslabs / aws-glue-catalog-sync-agent-for-hive
Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog
☆37 · Dec 5, 2023 · Updated 2 years ago
tinyclues / spark-glue-data-catalog
Apache Spark build compatible with AWS Glue Data Catalog.
☆19 · Aug 9, 2021 · Updated 4 years ago
viaduct-ai / docker-spark-k8s-aws
Docker image for running Spark 3 on Kubernetes on AWS
☆26 · May 26, 2021 · Updated 5 years ago
awslabs / aws-glue-libs
AWS Glue Libraries are additions and enhancements to Spark for ETL operations.
☆702 · Jul 1, 2026 · Updated 3 weeks ago
awsdocs / aws-glue-developer-guide
The open source version of the AWS Glue docs. You can submit feedback & requests for changes by submitting issues in this repo or by maki…
☆201 · Jun 15, 2023 · Updated 3 years ago
aws-samples / aws-glue-samples
AWS Glue code samples
☆1,539 · Jun 8, 2026 · Updated last month
devindatt / Spark-AWS-ETL
Building an ETL process using Spark EMR in AWS
☆10 · Jun 27, 2019 · Updated 7 years ago
awslabs / aws-glue-data-catalog-federation
☆12 · Updated this week
awslabs / aws-glue-streaming-libs
☆14 · Feb 26, 2024 · Updated 2 years ago
ExpediaGroup / drone-fly
A service which allows Hive Metastore Listeners to be deployed outside of the Hive Metastore Service
☆13 · Jun 30, 2026 · Updated 3 weeks ago
AbsaOSS / spline-spark-agent
Spline agent for Apache Spark
☆207 · Updated this week
sdaberdaku / spark-with-glue-builder
Docker image that builds a patched Apache Spark with AWS Glue support as metastore
☆18 · Jun 8, 2024 · Updated 2 years ago
aws-samples / emr-remote-shuffle-service
☆18 · May 7, 2026 · Updated 2 months ago
AbsaOSS / spline
Data Lineage Tracking And Visualization Solution
☆664 · Updated this week
awslabs / amazon-athena-cross-account-catalog
🌉 Reference implementation for granting cross-account AWS Glue Data Catalog access from Amazon Athena
☆30 · Jul 25, 2022 · Updated 4 years ago
KubedAI / spark-history-server
Helm Chart for deploying Spark history server in Amazon EKS for S3 Spark Event Logs
☆30 · Apr 4, 2026 · Updated 3 months ago
Netflix / iceberg
Iceberg is a table format for large, slow-moving tabular data
☆494 · Apr 10, 2023 · Updated 3 years ago
awslabs / amazon-emr-on-eks-custom-image-cli
Amazon EMR on EKS Custom Image CLI
☆32 · Sep 26, 2024 · Updated last year
awslabs / deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
☆3,638 · Jul 21, 2026 · Updated last week
aws-samples / dbtgluenyctaxidemo
☆11 · Oct 11, 2022 · Updated 3 years ago
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated last month
cerndb / sparkMeasure
This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w…
☆16 · May 21, 2026 · Updated 2 months ago
tiagotxm / yt-spark-no-kubernetes
☆13 · Feb 19, 2025 · Updated last year
aws / aws-sdk-pandas
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoD…
☆4,116 · Updated this week
datafusion-contrib / datafusion-catalogprovider-glue
☆23 · May 2, 2024 · Updated 2 years ago
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Updated this week
awslabs / aws-glue-blueprint-libs
☆71 · May 8, 2026 · Updated 2 months ago
jupyter-incubator / sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
☆1,364 · Sep 9, 2025 · Updated 10 months ago
hortonworks-spark / spark-atlas-connector
A Spark Atlas connector to track data lineage in Apache Atlas
☆268 · Nov 16, 2022 · Updated 3 years ago
delta-io / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…
☆8,925 · Updated this week
aws-samples / amazon-deequ-glue
Automated data quality suggestions and analysis with Deequ on AWS Glue
☆93 · Dec 29, 2022 · Updated 3 years ago
awslabs / amazon-emr-vscode-toolkit
A VS Code Extension to make it easier to manage and develop Spark jobs on EMR
☆39 · Feb 17, 2025 · Updated last year
purecloudlabs / aws_glue_etl_docker
Helper library to run AWS Glue ETL scripts docker container for local testing of development in a Jupyter notebook
☆20 · Feb 13, 2024 · Updated 2 years ago
awslabs / amazon-kinesis-connector-flink
This is a fork of the Apache Flink Kinesis connector adding Enhanced Fanout support for Flink 1.8/1.11 on KDA.
☆24 · Mar 1, 2026 · Updated 4 months ago
memiiso / debezium-server-iceberg
Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake
☆324 · Updated this week
imapi / spark-sqs-receiver
Spark SQS Amazon queue receiver
☆24 · Nov 22, 2021 · Updated 4 years ago
awslabs / athena-glue-service-logs
Glue scripts for converting AWS Service Logs for use in Athena
☆139 · Feb 1, 2024 · Updated 2 years ago
lyft / presto-gateway
A load balancer / proxy / gateway for prestodb
☆359 · Jul 25, 2024 · Updated 2 years ago
A3Data / pyspark-notebook-helm
This repo provides the Kubernetes Helm chart for deploying Pyspark Notebook.
☆17 · Nov 16, 2022 · Updated 3 years ago