onehouseinc / lake-loader
A tool to benchmark L (loading) workloads within ETL workloads
☆27Updated 4 months ago
Alternatives and similar repositories for lake-loader
Users who are interested in lake-loader are comparing it to the libraries listed below.
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.☆64Updated last year
- Multi-hop declarative data pipelines☆118Updated last week
- ☆95Updated 8 months ago
- A curated list of awesome lakehouse frameworks, applications, etc.☆35Updated 6 months ago
- Yet Another (Spark) ETL Framework☆21Updated last year
- ☆59Updated last year
- Presto Trino with Apache Hive Postgres metastore☆43Updated last year
- 🌟 Examples of use cases that utilize Decodable, as well as demos for related open-source projects such as Apache Flink, Debezium, and Po…☆83Updated 2 months ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating)☆63Updated 2 weeks ago
- In-Memory Analytics for Kafka using DuckDB☆137Updated this week
- A Table format agnostic data sharing framework☆38Updated last year
- ☆40Updated 2 years ago
- Compaction runtime for Apache Iceberg.☆78Updated this week
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆29Updated last week
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆100Updated 2 years ago
- Resource for the book Trino: The Definitive Guide (and formerly Presto: The Definitive Guide)☆229Updated 2 years ago
- A testing framework for Trino☆26Updated 5 months ago
- Management and automation platform for Stateful Distributed Systems☆110Updated this week
- Mock streaming data generator☆17Updated last year
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆81Updated 5 months ago
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™.☆43Updated 2 months ago
- Apache Iceberg Spark S3 examples☆20Updated last year
- Apache Kafka is an open-source distributed event streaming platform used by thousands of companies. uForwarder aims to address several pa…☆86Updated 6 months ago
- 📚 Tech blogs & talks by companies that run Apache Flink in production☆173Updated 3 weeks ago
- ACID Data Source for Apache Spark based on Hive ACID☆97Updated 4 years ago
- A highly available and infinitely scalable, drop-in replacement for Kafka Streams☆18Updated 3 months ago
- A library for Spark DataFrame using MinIO Select API☆99Updated 5 years ago
- ☆58Updated last week
- MemQ is an efficient, scalable cloud native PubSub system☆138Updated 2 weeks ago
- Apache Hive Metastore as a Standalone server in Docker☆80Updated last year