developer-advocacy-dremio / definitive-guide-to-apache-iceberg
☆47 Updated 7 months ago
Related projects
Alternatives and complementary repositories for definitive-guide-to-apache-iceberg
- Delta reader for the Ray open-source toolkit for building ML applications ☆42 Updated 9 months ago
- ☆43 Updated 3 months ago
- A Table format agnostic data sharing framework ☆38 Updated 9 months ago
- A write-audit-publish implementation on a data lake without the JVM ☆41 Updated 2 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated last week
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆111 Updated this week
- Multi-hop declarative data pipelines ☆91 Updated this week
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆144 Updated this week
- Schema modelling framework for decentralised domain-driven ownership of data. ☆247 Updated 11 months ago
- CLI tool to bulk migrate the tables from one catalog to another without a data copy ☆60 Updated this week
- The shared semantic layer definitions that dbt-core and MetricFlow use. ☆72 Updated this week
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 Updated last year
- Sample configuration to deploy a modern data platform. ☆86 Updated 2 years ago
- Weekly Data Engineering Newsletter ☆93 Updated 3 months ago
- Full stack data engineering tools and infrastructure set-up ☆41 Updated 3 years ago
- Pythonic Iceberg REST Catalog ☆65 Updated last month
- A DuckDB-powered command line interface for Snowflake security, governance, operations, and cost optimization. ☆37 Updated 2 months ago
- Unity Catalog UI ☆39 Updated 2 months ago
- ☆150 Updated 3 weeks ago
- ☆32 Updated 5 months ago
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆214 Updated this week
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆96 Updated last year
- Read Delta tables without any Spark ☆47 Updated 8 months ago
- A Python Library to support running data quality rules while the spark job is running⚡ ☆162 Updated this week
- Delta Lake helper methods. No Spark dependency. ☆22 Updated 2 months ago
- Adapter for dbt that executes dbt pipelines on Apache Flink