IntegriChain1 / s3parq
Parquet file management in S3 for Athena / Spectrum / Presto partitioning
☆22 · Updated 2 months ago
Alternatives and similar repositories for s3parq:
Users who are interested in s3parq compare it to the libraries listed below.
- A collection of Python utility functions ☆11 · Updated 9 months ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 · Updated 3 years ago
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 · Updated last year
- A tool to learn a JSON schema from a collection of documents and generate a CREATE TABLE statement for Redshift ☆20 · Updated 6 months ago
- Dask integration for Snowflake ☆30 · Updated 5 months ago
- ☆17 · Updated this week
- Data Catalog for Databases and Data Warehouses ☆34 · Updated last year
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆19 · Updated 4 years ago
- DataHub on AWS demonstration resources ☆10 · Updated 2 years ago
- ☆53 · Updated last year
- Amundsen Gremlin ☆21 · Updated 2 years ago
- A CLI to manage and monitor permissions in AWS Lake Formation ☆26 · Updated 2 years ago
- lakeview is a visibility tool for S3 based data lakes ☆29 · Updated last year
- The sane way of building a data layer in Airflow ☆24 · Updated 5 years ago
- A conda-smithy repository for python-duckdb. ☆13 · Updated last week
- Code samples related to "Harmonize, Search, and Analyze Loosely Coupled Datasets on AWS" (https://aws.amazon.com/blogs/big-data/harmonize… ☆22 · Updated 5 years ago
- ETLy is an add-on dashboard service on top of Apache Airflow. ☆69 · Updated last year
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 · Updated last year
- Continuously synchronize directories from remote object store to local filesystem ☆103 · Updated 2 months ago
- CLI for data platform ☆19 · Updated last year
- 🐋 Docker image for AWS Glue Spark/Python ☆23 · Updated last year
- Example Set up For DBT Cloud using Github Integrations ☆11 · Updated 5 years ago
- ☆19 · Updated 5 years ago
- ☆24 · Updated 5 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 · Updated 4 years ago
- ☆73 · Updated 10 months ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 · Updated 8 years ago
- An experimental Athena extension for DuckDB 🐤 ☆54 · Updated 3 months ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 · Updated 6 years ago
- Derivatives models written with the Tributary data flow library ☆23 · Updated last week