IntegriChain1 / s3parq
Parquet file management in S3 for Athena / Spectrum / Presto partitioning
☆22 Updated 6 months ago
Alternatives and similar repositories for s3parq
Users who are interested in s3parq are comparing it to the libraries listed below.
- A collection of Python utility functions ☆11 Updated last year
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 Updated 3 years ago
- Data pipelines from re-usable components ☆107 Updated 2 years ago
- ☆53 Updated last week
- Dask integration for Snowflake ☆30 Updated 2 weeks ago
- Data Catalog for Databases and Data Warehouses ☆35 Updated last year
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 Updated 2 years ago
- CLI for data platform ☆19 Updated last year
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated last year
- The open source version of the Amazon Athena documentation. To submit feedback & requests for changes, submit issues in this repository, … ☆83 Updated 2 years ago
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆120 Updated 2 months ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 Updated 2 years ago
- Continuously synchronize directories from remote object store to local filesystem ☆106 Updated 6 months ago
- Build your feature store with macros right within your dbt repository ☆39 Updated 2 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆110 Updated 3 weeks ago
- Examples of various flow deployments for Prefect 1.0 (storage and run configurations) ☆35 Updated 3 years ago
- Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0 ☆45 Updated 2 years ago
- Utilities for creating ETL pipelines with mara ☆36 Updated 3 years ago
- Build and deploy a serverless data pipeline on AWS with no effort. ☆111 Updated 2 years ago
- An experimental Athena extension for DuckDB 🐤 ☆54 Updated 7 months ago
- A JupyterLab extension providing SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino ☆90 Updated 2 months ago
- Deploy a Prefect flow to serverless AWS Lambda function ☆35 Updated 2 years ago
- The best Python package for comparing two dataframes ☆11 Updated 3 years ago
- Palm CLI - the tool-belt for data teams ☆47 Updated last year
- ☆34 Updated 2 years ago
- ☆27 Updated 2 weeks ago
- A CLI to manage and monitor permissions in AWS Lake Formation ☆26 Updated 2 years ago
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆79 Updated 2 weeks ago
- CLI tool to launch Spark jobs on AWS EMR ☆67 Updated last year
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 Updated last year