agile-lab-dev / witboost-starter-kit
Witboost is a versatile platform that addresses a wide range of sophisticated data engineering challenges. The Starter Kit showcases the integration capabilities and provides a "batteries-included" product.
☆25 · Updated 2 weeks ago
Alternatives and similar repositories for witboost-starter-kit
Users who are interested in witboost-starter-kit are comparing it to the repositories listed below
- Data validation library for PySpark 3.0.0 ☆33 · Updated 3 years ago
- An open specification for data products in Data Mesh ☆63 · Updated 4 months ago
- ☆100 · Updated 2 years ago
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started! ☆60 · Updated 3 years ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆80 · Updated this week
- PDF DataSource for Apache Spark that reads PDF files directly into a DataFrame and runs OCR on them ☆78 · Updated 9 months ago
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆76 · Updated 4 years ago
- Sample configuration to deploy a modern data platform. ☆89 · Updated 4 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆61 · Updated 2 years ago
- A Table format agnostic data sharing framework ☆42 · Updated 2 years ago
- Delta lake and filesystem helper methods ☆50 · Updated last year
- A dbt (data build tool) project you can use for testing purposes or experimentation ☆36 · Updated 2 years ago
- Delta Lake helper methods in PySpark ☆327 · Updated 3 weeks ago
- ☆65 · Updated last year
- The go-to demo for public and private dbt Learn ☆82 · Updated 10 months ago
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆182 · Updated last year
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆226 · Updated last week
- ☆110 · Updated last year
- Spark and Delta Lake Workshop ☆22 · Updated 3 years ago
- Data Product Portal created by Dataminded ☆198 · Updated this week
- Spark style guide ☆271 · Updated last year
- Yet Another (Spark) ETL Framework ☆21 · Updated 2 years ago
- Managing Data as a Product, published by Packt ☆18 · Updated last year
- Template for a data contract used in a data mesh. ☆486 · Updated last year
- Delta Lake examples ☆238 · Updated last year
- A Python Library to support running data quality rules while the spark job is running⚡ ☆197 · Updated last week
- Weekly Data Engineering Newsletter ☆96 · Updated last year
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆76 · Updated last year
- How to evaluate the Quality of your Data with Great Expectations and Spark. ☆31 · Updated 2 years ago
- A lightweight helper utility which allows developers to do interactive pipeline development by having a unified source code for both DLT … ☆49 · Updated 3 years ago