fabiogouw / spark-aws-messaging
A custom sink provider for Apache Spark that sends the contents of a DataFrame to an AWS SQS queue
☆19 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for spark-aws-messaging
- Example code for running Spark and Hive jobs on EMR Serverless. ☆151 · Updated 2 weeks ago
- The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a… ☆205 · Updated 6 months ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 · Updated 2 years ago
- EMR Hudi Workshop content ☆12 · Updated 2 years ago
- A best practices guide for using AWS EMR. The guide will cover best practices on the topics of cost, performance, security, operational e… ☆102 · Updated this week
- This repository is for demonstrating the capability to do SQL-based UPDATES, DELETES, and INSERTS directly in the Data Lake using Amazon … ☆16 · Updated 3 years ago
- Performant Redshift data source for Apache Spark ☆136 · Updated 3 months ago
- A library that provides useful extensions to Apache Spark and PySpark. ☆196 · Updated last week
- Spark style guide ☆257 · Updated last month
- Task Metrics Explorer ☆13 · Updated 5 years ago
- Spark runtime on AWS Lambda ☆93 · Updated last month
- Automated data quality suggestions and analysis with Deequ on AWS Glue ☆83 · Updated last year
- BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables. ☆376 · Updated last week
- Airflow helm chart for AWS EKS ☆18 · Updated 3 years ago
- Sample code to collect Apache Iceberg metrics for table monitoring ☆19 · Updated 2 months ago
- Kinesis Connector for Structured Streaming ☆137 · Updated 4 months ago
- Snowflake Data Source for Apache Spark. ☆217 · Updated this week
- This repository contains the dbt-glue adapter ☆99 · Updated last week
- Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog ☆33 · Updated 11 months ago
- The Internals of Delta Lake ☆182 · Updated last month
- AWS Glue Schema Registry Client library provides serializers / de-serializers for applications to integrate with AWS Glue Schema Registry… ☆131 · Updated this week
- Delta Lake examples ☆205 · Updated last month
- A Python Library to support running data quality rules while the spark job is running⚡ ☆162 · Updated this week
- A simplified, lightweight ETL Framework based on Apache Spark ☆584 · Updated 9 months ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆65 · Updated 2 years ago
- Spark Standalone & Livy ☆12 · Updated 3 years ago