aws-samples / emr-presto-query-event-listener
Implementation of the query event listener plugin in Java to log Presto statistics on Amazon EMR for auditing and performance insights
☆13Updated 6 years ago
Alternatives and similar repositories for emr-presto-query-event-listener:
Users interested in emr-presto-query-event-listener are comparing it to the repositories listed below
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.☆88Updated 11 months ago
- Presto Gateway routes queries based on policy.☆12Updated 4 years ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.☆20Updated 3 years ago
- Reference architecture for real-time stream processing with Apache Flink on Amazon EMR, Amazon Kinesis, and Amazon Elasticsearch Service.☆71Updated last year
- Apache Ranger Plugin for S3☆19Updated 2 years ago
- Testbench for experimenting with Apache Hive at any data scale.☆64Updated 7 years ago
- hive_compared_bq compares/validates two (SQL-like) tables and graphically shows the rows/columns that differ.☆28Updated 7 years ago
- Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.☆18Updated 7 years ago
- Ansible playbook for automated HDP 2.x deployment with Kerberos☆19Updated 8 years ago
- Spark 3.0.0 Structured Streaming Kafka Avro Demo☆15Updated last year
- Quickly deploy Hadoop with the help of Ansible and Apache Ambari☆37Updated 9 years ago
- A logstash codec plugin for decoding and encoding Avro records☆15Updated 8 months ago
- Apiary provides modules which can be combined to create a federated cloud data lake☆36Updated 10 months ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep…☆19Updated 4 years ago
- Presto K8S Operator☆9Updated 4 years ago
- Provides functionality to build statistical models to repair dirty tabular data in Spark☆12Updated last year
- Examples of Spark 3.0☆47Updated 4 years ago
- Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog☆35Updated last year
- Framework for running macro benchmarks in a clustered environment☆24Updated 2 years ago
- Spark structured streaming with Kafka data source and writing to Cassandra☆62Updated 5 years ago
- Presto for Cloudera Manager parcel☆21Updated 8 years ago
- Hadoop Cluster Configurations☆32Updated 3 years ago
- Install Cloudera's distribution of Hadoop including Cloudera Manager and Cloudera Search (Beta)☆31Updated 11 years ago
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…☆10Updated 2 years ago
- Docker Image for Kudu☆38Updated 6 years ago
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark☆13Updated last year
- An example of building a Kubernetes operator (Flink) using Abstract operator's framework☆26Updated 5 years ago