IntegriChain1 / s3parq
Parquet file management in S3 for Athena / Spectrum / Presto partitioning
☆22 Updated last year
Alternatives and similar repositories for s3parq
Users who are interested in s3parq are comparing it to the libraries listed below.
- A collection of python utility functions ☆11 Updated this week
- Data Catalog for Databases and Data Warehouses ☆36 Updated 2 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 Updated 4 years ago
- A CLI to manage and monitor permissions in AWS Lake Formation ☆25 Updated 3 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆115 Updated last week
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 Updated 2 years ago
- Data pipelines from re-usable components ☆107 Updated 2 months ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆62 Updated 3 years ago
- ☆58 Updated last month
- The sane way of building a data layer in Airflow ☆24 Updated 6 years ago
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆81 Updated 2 weeks ago
- Build and deploy a serverless data pipeline on AWS with no effort. ☆110 Updated 3 years ago
- Amundsen Gremlin ☆22 Updated 3 years ago
- Continuously synchronize directories from remote object store to local filesystem ☆109 Updated last week
- Dask integration for Snowflake ☆30 Updated 6 months ago
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆25 Updated 2 years ago
- Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0 ☆45 Updated 3 years ago
- The open source version of the Amazon Athena documentation. To submit feedback & requests for changes, submit issues in this repository, … ☆84 Updated 2 years ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆25 Updated 7 years ago
- CLI for data platform ☆20 Updated 2 months ago
- pysh-db - The Data Science Toolkit (DSK) ☆13 Updated 7 years ago
- The best Python package for comparing two dataframes ☆11 Updated 4 years ago
- ☆35 Updated 2 years ago
- ☆15 Updated 4 years ago
- Utility functions for dbt projects running on Spark ☆34 Updated last month
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆25 Updated last year
- ☆72 Updated last year
- AWS AppSync resolver that provides GraphQL access to Athena databases ☆14 Updated 3 years ago
- Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker… ☆84 Updated 3 years ago
- ETLy is an add-on dashboard service on top of Apache Airflow. ☆68 Updated 2 years ago