wirelessr / flink-iceberg-playground
MinIO as local storage and DynamoDB as the catalog
☆13 · Updated 8 months ago
Alternatives and similar repositories for flink-iceberg-playground:
Users who are interested in flink-iceberg-playground are comparing it to the repositories listed below.
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 · Updated 9 years ago
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark ☆13 · Updated last year
- Demonstration of a Hive Input Format for Iceberg ☆26 · Updated 3 years ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆17 · Updated 3 years ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆19 · Updated 4 years ago
- Automatically loads new partitions in AWS Athena ☆18 · Updated 4 years ago
- A basic Apache Pinot example for ingesting real-time MySQL change logs using Debezium ☆27 · Updated 4 years ago
- Hadoop/Hive/Spark container to perform CI tests ☆11 · Updated 4 years ago
- Dione - a Spark and HDFS indexing library ☆50 · Updated 9 months ago
- A testing framework for Trino ☆26 · Updated last month
- DataHub on AWS demonstration resources ☆10 · Updated last year
- Connect DbVisualizer to Hortonworks HiveServer2 ☆9 · Updated 9 years ago
- Cloud Storage Connector integrates Apache Pulsar with cloud storage. ☆28 · Updated this week
- Demos of Materialize, the operational data warehouse. ☆51 · Updated 4 months ago
- A tool to learn a JSON schema from a collection of documents and generate a CREATE TABLE statement for Redshift ☆19 · Updated 3 months ago
- AWS Quick Start Team ☆14 · Updated 3 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 · Updated 3 weeks ago
- Demos using Conduktor Gateway ☆16 · Updated 9 months ago
- ☆47 · Updated 5 months ago
- ☆35 · Updated 3 weeks ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 · Updated 8 years ago
- Amundsen Gremlin ☆20 · Updated 2 years ago
- A curated list of Apache Pulsar resources ☆13 · Updated 6 years ago
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics ☆65 · Updated last year
- Set of tools for creating backups, compaction and restoration of Apache Kafka® Clusters ☆19 · Updated this week
- Some AWS EMR examples ☆16 · Updated 7 years ago
- Java implementation for performing operations on Apache Iceberg and Hive tables ☆20 · Updated 3 months ago
- Stream data generator ☆14 · Updated 6 months ago
- GetInData Helm Charts repository ☆12 · Updated 2 years ago