Use SQL to build ELT pipelines on a data lakehouse.
☆289 · May 25, 2022 · Updated 3 years ago
Alternatives and similar repositories for cuelake
Users who are interested in cuelake are comparing it to the libraries listed below.
- Airbyte clone written in Go and Vue.js. Works with Airbyte connectors. ☆17 · Jul 24, 2021 · Updated 4 years ago
- Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases ☆235 · Feb 23, 2022 · Updated 4 years ago
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,430 · Updated this week
- The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lak… ☆20,840 · Updated this week
- Data Pipeline Framework using the singer.io spec ☆658 · Feb 26, 2026 · Updated last week
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,569 · Apr 30, 2024 · Updated last year
- An experimental materialized view solution based on TiDB/TiKV and Flink with strong consistency support. ☆65 · Oct 18, 2021 · Updated 4 years ago
- Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from… ☆35 · Jan 5, 2023 · Updated 3 years ago
- Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages. ☆890 · Updated this week
- Apache Iceberg ☆8,592 · Updated this week
- Dataform is a framework for managing SQL based data operations in BigQuery ☆964 · Updated this week
- A simplified, lightweight ETL Framework based on Apache Spark ☆587 · Jan 24, 2024 · Updated 2 years ago
- ☆13 · Jan 5, 2022 · Updated 4 years ago
- Some random how-to examples relating to Databricks. ☆15 · Nov 3, 2021 · Updated 4 years ago
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,744 · Mar 1, 2026 · Updated last week
- First open-source data discovery and observability platform. We make a life for data practitioners easy so you can focus on your business… ☆1,383 · Mar 3, 2026 · Updated last week
- The metrics layer for your data. Join us at https://metriql.com/slack ☆326 · Mar 29, 2023 · Updated 2 years ago
- Collect, aggregate, and visualize a data ecosystem's metadata ☆2,132 · Mar 1, 2026 · Updated last week
- ☆16 · Nov 27, 2025 · Updated 3 months ago
- Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to b… ☆806 · Aug 10, 2022 · Updated 3 years ago
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆120 · Sep 16, 2025 · Updated 5 months ago
- An Open Standard for lineage metadata collection ☆2,340 · Updated this week
- Arc is an opinionated framework for defining data pipelines which are predictable, repeatable and manageable. ☆171 · Feb 10, 2024 · Updated 2 years ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆261 · Dec 5, 2023 · Updated 2 years ago
- Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface. ☆2,241 · Updated this week
- Demonstration of a Hive Input Format for Iceberg ☆26 · Mar 12, 2021 · Updated 4 years ago
- Data Lineage Tracking And Visualization Solution ☆656 · Mar 3, 2026 · Updated last week
- A CLI and library to run Singer Taps and Targets ☆35 · Mar 23, 2022 · Updated 3 years ago
- Self-serve BI to 10x your data team ⚡️ ☆5,601 · Updated this week
- Stateful Functions for Apache Flink ☆279 · Dec 13, 2023 · Updated 2 years ago
- Beneath is a serverless real-time data platform ⚡️ ☆84 · Feb 18, 2022 · Updated 4 years ago
- Upserts, Deletes And Incremental Processing on Big Data. ☆6,106 · Updated this week
- KNOTS is an intuitive desktop application built to simplify the configuration of Singer pipelines ☆67 · Jan 20, 2023 · Updated 3 years ago
- dbt-sugar is a CLI tool that allows users of dbt to have fun and ease performing actions around dbt models ☆153 · Mar 2, 2026 · Updated last week
- ☆494 · Oct 21, 2022 · Updated 3 years ago
- sgr (command line client for Splitgraph) and the splitgraph Python library ☆324 · Apr 30, 2024 · Updated last year
- Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data qua… ☆759 · Jun 8, 2024 · Updated last year
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap… ☆304 · Oct 30, 2025 · Updated 4 months ago
- Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set-up a real-time data pipeli… ☆4,672 · Updated this week