starlake-ai / starlake
Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.
☆57 Updated this week
Related projects
Alternatives and complementary repositories for starlake
- Test data management tool for any data source, batch or real-time. Generate, validate and clean up data all in one tool. ☆39 Updated 3 weeks ago
- A DataOps framework for building a lakehouse. ☆32 Updated this week
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆111 Updated last week
- dbt's adapter for Dremio ☆48 Updated 2 years ago
- Library to convert dbt manifest metadata to Airflow tasks ☆46 Updated 8 months ago
- Alto is a versatile data integration tool that allows you to easily run Singer plugins, build and cache PEX files encapsulating those plu… ☆56 Updated last year
- Data Tools Subjective List ☆80 Updated last year
- dbt (data build tool) adapter for Dremio ☆44 Updated this week
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆92 Updated 3 weeks ago
- Delta Lake and filesystem helper methods ☆49 Updated 8 months ago
- Query Snowflake tables locally with DuckDB, without any need for a running warehouse ☆101 Updated this week
- Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB-compatible SQL (with deep transformation of functions, data type… ☆29 Updated this week
- A table-format-agnostic data sharing framework ☆38 Updated 9 months ago
- A write-audit-publish implementation on a data lake without the JVM ☆41 Updated 3 months ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆71 Updated this week
- Pytest plugin for dbt core ☆58 Updated 5 months ago
- Unity Catalog UI ☆39 Updated 2 months ago
- A tool that makes it easy to run modular Trino environments locally. ☆33 Updated this week
- Python wrapper for the Sling CLI tool ☆43 Updated last month
- Utility functions for dbt projects running on Spark ☆31 Updated last year
- Adapter for dbt that executes dbt pipelines on Apache Flink ☆84 Updated 8 months ago
- Make dbt docs and Apache Superset talk to one another ☆137 Updated 2 months ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆43 Updated 9 months ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆247 Updated 11 months ago
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆81 Updated 3 months ago
- Data product portal created by Dataminded ☆148 Updated this week
- Docker environment to stream data from Kafka to Iceberg tables ☆24 Updated 8 months ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆60 Updated last year
- Define, govern, and model event data for warehouse-first product analytics. ☆82 Updated 4 months ago
- Sample configuration to deploy a modern data platform. ☆86 Updated 2 years ago