data-catering / data-caterer
Test data management tool for any data source, batch or real-time. Generate, validate and clean up data all in one tool.
☆39 Updated last month
Related projects
Alternatives and complementary repositories for data-caterer
- Declarative text-based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines. ☆57 Updated this week
- A DataOps framework for building a lakehouse. ☆32 Updated this week
- Unity Catalog UI ☆39 Updated 2 months ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆43 Updated 9 months ago
- A write-audit-publish implementation on a data lake without the JVM ☆41 Updated 3 months ago
- Query Snowflake tables locally with DuckDB, without any need for a running warehouse ☆105 Updated this week
- ☆28 Updated 2 weeks ago
- Discover the simplicity and strength of DuckDB, dbt, and Iceberg in this project. Create an efficient, versatile data analytics solution … ☆32 Updated last year
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated 3 weeks ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆112 Updated last week
- A table-format-agnostic data sharing framework ☆38 Updated 9 months ago
- Sample configuration to deploy a modern data platform. ☆86 Updated 2 years ago
- A platform and cloud-based service for data sharing based on the Delta Sharing protocol. ☆21 Updated 5 months ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 Updated last year
- A guide for leading a data (engineering) team ☆60 Updated 6 months ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆62 Updated 2 months ago
- A curated list of Dagster code snippets for data engineers ☆52 Updated 8 months ago
- New-generation open-source data stack ☆61 Updated 2 years ago
- Data Tools Subjective List ☆80 Updated last year
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆189 Updated this week
- Delta Lake helper methods. No Spark dependency. ☆22 Updated 2 months ago
- A DuckDB-powered command line interface for Snowflake security, governance, operations, and cost optimization. ☆37 Updated 3 months ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆92 Updated last month
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆82 Updated 3 months ago
- Delta Lake Documentation ☆47 Updated 5 months ago
- Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL (with deep transformation of functions, data type… ☆29 Updated this week
- JumpSpark - A modern cookiecutter template for PySpark projects with batteries included. ☆10 Updated last year
- Compare different technologies. No BS and all sources linked. ☆13 Updated 6 months ago
- The Data Product Descriptor Specification (DPDS) Repository ☆70 Updated last week