onehouseinc / lake-loader
A tool to benchmark L (loading) workloads within ETL workloads
☆29Updated last week
Alternatives and similar repositories for lake-loader
Users who are interested in lake-loader are comparing it to the libraries listed below
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.☆65Updated 2 years ago
- ☆40Updated 2 years ago
- A Table format agnostic data sharing framework☆42Updated last year
- Compaction runtime for Apache Iceberg.☆111Updated this week
- Multi-hop declarative data pipelines☆122Updated 2 weeks ago
- ☆105Updated 10 months ago
- ☆59Updated last week
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆102Updated 2 years ago
- Yet Another (Spark) ETL Framework☆21Updated 2 years ago
- Stackable Operator for Apache Airflow☆32Updated last week
- Presto Trino with Apache Hive Postgres metastore☆43Updated last year
- A curated list of awesome lakehouse frameworks, applications, etc.☆36Updated last week
- ☆63Updated last year
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format.☆168Updated 2 months ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....☆77Updated last week
- 📚 Tech blogs & talks by companies that run Apache Flink in production☆184Updated 3 weeks ago
- ☆81Updated 7 months ago
- Mock streaming data generator☆17Updated last year
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark…☆142Updated 3 months ago
- ☆269Updated last year
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆83Updated 7 months ago
- Trino connectors for accessing APIs with an OpenAPI spec☆41Updated last week
- 🌟 Examples of use cases that utilize Decodable, as well as demos for related open-source projects such as Apache Flink, Debezium, and Po…☆85Updated 5 months ago
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake☆290Updated last week
- Resource for the book Trino: The Definitive Guide (and formerly Presto: The Definitive Guide)☆230Updated 3 years ago
- Sparglim✨ makes PySpark App Configurable and Deploy Spark Connect Server Easier!☆39Updated 3 weeks ago
- The observability platform for Iceberg lakehouses.☆401Updated last week
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database☆77Updated 4 years ago
- Dashboard for operating Flink jobs and deployments.☆41Updated 2 months ago
- Apache Kafka is an open-source distributed event streaming platform used by thousands of companies. uForwarder aims to address several pa…☆95Updated 2 months ago