onehouseinc / lake-loader
A tool to benchmark L (loading) workloads within ETL workloads
☆30Updated 3 weeks ago
Alternatives and similar repositories for lake-loader
Users that are interested in lake-loader are comparing it to the libraries listed below
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.☆65Updated 2 years ago
- Multi-hop declarative data pipelines☆124Updated last week
- 🌟 Examples of use cases that utilize Decodable, as well as demos for related open-source projects such as Apache Flink, Debezium, and Po…☆87Updated 7 months ago
- Presto Trino with Apache Hive Postgres metastore☆43Updated last year
- ☆107Updated last year
- Apache Kafka is an open-source distributed event streaming platform used by thousands of companies. uForwarder aims to address several pa…☆101Updated 3 months ago
- In-Memory Analytics for Kafka using DuckDB☆147Updated this week
- A Table format agnostic data sharing framework☆42Updated last year
- Traditionally, engineers were needed to implement business logic via data pipelines before business users can start using it. Using this …☆12Updated last week
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆103Updated 2 years ago
- ☆65Updated last year
- Management and automation platform for Stateful Distributed Systems☆110Updated last week
- Compaction runtime for Apache Iceberg.☆114Updated this week
- A testing framework for Trino☆26Updated 10 months ago
- ☆61Updated last week
- Yet Another (Spark) ETL Framework☆21Updated 2 years ago
- Mock streaming data generator☆17Updated last year
- 📚 Tech blogs & talks by companies that run Apache Flink in production☆188Updated last month
- ☆40Updated 2 years ago
- ☆81Updated 9 months ago
- Dashboard for operating Flink jobs and deployments.☆43Updated 4 months ago
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format.☆169Updated 4 months ago
- Docker environment to stream data from Kafka to Iceberg tables☆30Updated last year
- DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from da…☆50Updated 2 months ago
- CLI tool to bulk migrate the tables from one catalog to another without a data copy☆83Updated 9 months ago
- Sparglim✨ makes PySpark App Configurable and Deploy Spark Connect Server Easier!☆41Updated last week
- a curated list of awesome lakehouse frameworks, applications, etc☆40Updated 2 months ago
- Explore Apache Kafka data pipelines in Kubernetes.☆47Updated 6 months ago
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark…☆146Updated this week
- Iceberg Playground in a Box☆67Updated 7 months ago