getindata / doge-datagen
☆16 Updated last year
Related projects
Alternatives and complementary repositories for doge-datagen
- Library to convert dbt manifest metadata to Airflow tasks ☆46 Updated 8 months ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆43 Updated 9 months ago
- Declarative text-based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines. ☆57 Updated this week
- CLI tool to bulk migrate tables from one catalog to another without a data copy ☆61 Updated this week
- Adapter for dbt that executes dbt pipelines on Apache Flink ☆84 Updated 8 months ago
- ☆13 Updated 9 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated 2 weeks ago
- dbt ksqlDB adapter ☆27 Updated 2 years ago
- ☆78 Updated last year
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ☆41 Updated last month
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆92 Updated 3 weeks ago
- ☆40 Updated last year
- Data validation library for PySpark 3.0.0 ☆34 Updated 2 years ago
- ☆18 Updated 7 months ago
- A table-format-agnostic data sharing framework ☆38 Updated 9 months ago
- Multi-hop declarative data pipelines ☆91 Updated 2 weeks ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 Updated 6 years ago
- Test data management tool for any data source, batch or real-time. Generate, validate and clean up data all in one tool. ☆39 Updated 3 weeks ago
- Recipes of publicly available Jupyter images ☆8 Updated last month
- ☆49 Updated 8 months ago
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or… ☆92 Updated 2 years ago
- Profiles the data, validates the schema, runs data quality checks, and produces a report ☆20 Updated 5 years ago
- Big Data Newsletter ☆23 Updated 7 months ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆71 Updated this week
- CLI for data platform ☆19 Updated 11 months ago
- Kafka Connector for Iceberg tables ☆16 Updated last year
- ☆43 Updated 3 months ago
- Discover the simplicity and strength of Duckdb, dbt, and Iceberg in this project. Create an efficient, versatile data analytics solution … ☆32 Updated last year
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 Updated 8 months ago
- Dashboard for operating Flink jobs and deployments. ☆25 Updated this week