getindata / doge-datagen
☆18Updated 2 years ago
Alternatives and similar repositories for doge-datagen:
Users who are interested in doge-datagen are comparing it to the libraries listed below:
- Data validation library for PySpark 3.0.0☆34Updated 2 years ago
- Library to convert DBT manifest metadata to Airflow tasks☆47Updated 10 months ago
- ☆47Updated 5 months ago
- Adapter for dbt that executes dbt pipelines on Apache Flink☆88Updated 10 months ago
- dbt's adapter for dremio☆48Updated 2 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi…☆93Updated this week
- Big Data Newsletter☆25Updated 9 months ago
- ☆18Updated 9 months ago
- ☆14Updated 11 months ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and…☆28Updated last year
- CLI for data platform☆19Updated last year
- ☆79Updated last year
- ☆62Updated this week
- Flowchart for debugging Spark applications☆104Updated 3 months ago
- Delta reader for the Ray open-source toolkit for building ML applications☆43Updated 11 months ago
- A Table format agnostic data sharing framework☆38Updated 11 months ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....☆74Updated this week
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆97Updated last year
- Examples for High Performance Spark☆15Updated 2 months ago
- Docker environment to stream data from Kafka to Iceberg tables☆24Updated 10 months ago
- Code snippets used in demos recorded for the blog.☆29Updated this week
- ☆40Updated last year
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis☆9Updated 11 months ago
- Multi-hop declarative data pipelines☆103Updated this week
- dbt ksqlDB adapter☆27Updated 2 years ago
- ☆25Updated 4 months ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them☆135Updated last year
- The Internals of Spark on Kubernetes☆70Updated 2 years ago
- Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.☆70Updated this week
- Magic to help Spark pipelines upgrade☆34Updated 3 months ago