arthurprevot / yaetos
Write data & AI pipelines in (SQL, Spark, Pandas) and deploy to the cloud, simplified
☆36 Updated 8 months ago
Alternatives and similar repositories for yaetos
Users who are interested in yaetos are comparing it to the libraries listed below.
- ☆11 Updated last year
- Read Delta tables without any Spark ☆47 Updated last year
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆78 Updated this week
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆97 Updated this week
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆159 Updated 3 years ago
- A Table format agnostic data sharing framework ☆42 Updated last year
- Official dbt adapter for Vertica ☆27 Updated 6 months ago
- The dbt adapter for Firebolt ☆30 Updated this week
- Composable filesystem hooks and operators for Apache Airflow. ☆17 Updated 4 years ago
- Delta Lake helper methods. No Spark dependency. ☆23 Updated last year
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆30 Updated 2 weeks ago
- Rules based grant management for Snowflake ☆41 Updated 6 years ago
- Python package for querying iceberg data through duckdb. ☆71 Updated last year
- Sample code to collect Apache Iceberg metrics for table monitoring ☆29 Updated last year
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆76 Updated 4 years ago
- Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful … ☆145 Updated last year
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆103 Updated 2 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 3 years ago
- Utility functions for dbt projects running on Spark ☆34 Updated 3 weeks ago
- Faker for Snowflake! ☆33 Updated 3 years ago
- Demos of Materialize, the operational data warehouse. ☆52 Updated 10 months ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 Updated 3 years ago
- Yet Another (Spark) ETL Framework ☆21 Updated 2 years ago
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary! ☆235 Updated 11 months ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆260 Updated 2 years ago
- ☆81 Updated 8 months ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆70 Updated 4 months ago
- The sane way of building a data layer in Airflow ☆24 Updated 6 years ago
- Airflow declarative DAGs via YAML ☆133 Updated 2 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆115 Updated this week