arthurprevot / yaetos
Write data & AI pipelines in (SQL, Spark, Pandas) and deploy to the cloud, simplified
★34 Updated last month
Alternatives and similar repositories for yaetos:
Users who are interested in yaetos are comparing it to the libraries listed below:
- Examples for High Performance Spark ★15 Updated 4 months ago
- 💻 CLI for reporting events to Faros platform ★14 Updated 4 months ago
- Yet Another (Spark) ETL Framework ★20 Updated last year
- A write-audit-publish implementation on a data lake without the JVM ★46 Updated 6 months ago
- This repository contains recipes for Apache Pinot. ★29 Updated this week
- Using the Parquet file format with Python ★15 Updated last year
- ⚠️ MAINTENANCE-ONLY MODE: Snowplow maintained SQL data models for working with Snowplow web and mobile behavioral data. ★41 Updated last month
- Delta reader for the Ray open-source toolkit for building ML applications ★45 Updated last year
- Utility functions for dbt projects running on Spark ★31 Updated 2 weeks ago
- Delta Lake helper methods. No Spark dependency. ★22 Updated 5 months ago
- Amundsen Gremlin ★21 Updated 2 years ago
- Profiles the data, validates the schema and runs data quality checks and produces a report ★20 Updated 5 years ago
- ★10 Updated 6 years ago
- Read Delta tables without any Spark ★47 Updated 11 months ago
- Data Sketches for Apache Spark ★22 Updated 2 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ★18 Updated 8 years ago
- A library to mutate parquet files ★19 Updated last year
- Sample code to collect Apache Iceberg metrics for table monitoring ★24 Updated 6 months ago
- The dbt adapter for Firebolt ★29 Updated last month
- Demonstration of a Hive Input Format for Iceberg ★26 Updated 3 years ago
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ★41 Updated 5 months ago
- Build a REST API on top of your data warehouse ★42 Updated 2 years ago
- The sane way of building a data layer in Airflow ★24 Updated 5 years ago
- Collection of utility scripts to extract code so it can be upgraded to SnowFlake using the SnowConvert tool. ★12 Updated last week
- dbt package for monitoring airflow DAGs and tasks ★29 Updated 2 weeks ago
- Rules based grant management for Snowflake ★40 Updated 6 years ago
- PySpark phonetic and string matching algorithms ★39 Updated last year
- Lightweight configuration and access to multiple databases in a single project ★38 Updated last year
- ★15 Updated last year
- DataHub on AWS demonstration resources ★10 Updated 2 years ago