onehouseinc / lake-loader
A tool to benchmark L (loading) workloads within ETL workloads
☆26Updated 2 months ago
Alternatives and similar repositories for lake-loader
Users that are interested in lake-loader are comparing it to the libraries listed below
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.☆64Updated last year
- ☆40Updated 2 years ago
- Traditionally, engineers were needed to implement business logic via data pipelines before business users can start using it. Using this …☆12Updated last week
- 🌟 Examples of use cases that utilize Decodable, as well as demos for related open-source projects such as Apache Flink, Debezium, and Po…☆79Updated last month
- Multi-hop declarative data pipelines☆117Updated last week
- ☆58Updated 11 months ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆99Updated 2 years ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating)☆61Updated 7 months ago
- A testing framework for Trino☆26Updated 4 months ago
- ☆89Updated 6 months ago
- A Table format agnostic data sharing framework☆38Updated last year
- Apache Kafka is an open-source distributed event streaming platform used by thousands of companies. uForwarder aims to address several pa…☆77Updated 4 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆29Updated last week
- Compaction runtime for Apache Iceberg.☆58Updated last week
- Resource for the book Trino: The Definitive Guide (and formerly Presto: The Definitive Guide)☆226Updated 2 years ago
- Extensible streaming ingestion pipeline on top of Apache Spark☆45Updated 2 weeks ago
- Distributed SQL query engine for big data☆49Updated this week
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format.☆163Updated 8 months ago
- ☆39Updated 3 weeks ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs …☆158Updated 2 years ago
- 📚 Tech blogs & talks by companies that run Apache Flink in production☆172Updated last month
- Yet Another (Spark) ETL Framework☆21Updated last year
- In-Memory Analytics for Kafka using DuckDB☆132Updated last week
- a curated list of awesome lakehouse frameworks, applications, etc☆34Updated 5 months ago
- This project provides fully automated one-click experience to create Cloud and Kubernetes environment to run Data Analytics workload like…☆55Updated 2 years ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them☆135Updated last year
- ☆58Updated last week
- Mock streaming data generator☆17Updated last year
- Apache Iceberg Spark S3 examples☆20Updated last year
- Presto Trino with Apache Hive Postgres metastore☆43Updated 10 months ago