deordie / deordie-digest
Data Engineering Digest
☆28 Updated 9 months ago
Alternatives and similar repositories for deordie-digest:
Users who are interested in deordie-digest are comparing it to the repositories listed below:
- Command-line interface to quickly generate fake CSV and JSON data ☆72 Updated 9 months ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis ☆9 Updated last year
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- A Table format agnostic data sharing framework ☆38 Updated last year
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- Magic to help Spark pipelines upgrade ☆34 Updated 6 months ago
- Weekly Data Engineering Newsletter ☆94 Updated 9 months ago
- Kafka Connector for Iceberg tables ☆16 Updated last year
- Flowchart for debugging Spark applications ☆105 Updated 6 months ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆58 Updated last year
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture ☆60 Updated 3 months ago
- A DuckDB-powered command line interface for Snowflake security, governance, operations, and cost optimization. ☆40 Updated 8 months ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 Updated 2 years ago
- ☆18 Updated 3 years ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆68 Updated 6 months ago
- ☆53 Updated 8 months ago
- Sample configuration to deploy a modern data platform. ☆88 Updated 3 years ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- ITSumma Spark Greenplum Connector ☆37 Updated last year
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆264 Updated 3 weeks ago
- Delta Lake helper methods. No Spark dependency. ☆23 Updated 7 months ago
- Aiven's S3 Sink Connector for Apache Kafka® ☆69 Updated 7 months ago
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆74 Updated 3 years ago
- ☆16 Updated this week
- DE or DIE meetup made by data engineers for data engineers. Currently in Russian only. ☆57 Updated last year
- ☆28 Updated 4 months ago
- Library to convert DBT manifest metadata to Airflow tasks ☆48 Updated last year
- Airflow declarative DAGs via YAML ☆132 Updated last year
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce ☆27 Updated 3 months ago
- Sample code to collect Apache Iceberg metrics for table monitoring ☆26 Updated 8 months ago