IntegriChain1 / s3parq
Parquet file management in S3 for Athena / Spectrum / Presto partitioning
☆22 Updated last month
Alternatives and similar repositories for s3parq:
Users who are interested in s3parq are comparing it to the libraries listed below.
- Data Catalog for Databases and Data Warehouses ☆33 Updated last year
- A CLI to manage and monitor permissions in AWS Lake Formation ☆27 Updated 2 years ago
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- A conda-smithy repository for python-duckdb. ☆13 Updated last week
- Amundsen Gremlin ☆21 Updated 2 years ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ☆29 Updated last year
- A collection of python utility functions ☆11 Updated 8 months ago
- Derivatives models written with the Tributary data flow library ☆23 Updated 4 months ago
- A tool to learn a JSON schema from a collection of documents and generate a CREATE TABLE statement for Redshift ☆19 Updated 5 months ago
- Puppet module to provision Airbnb's Airflow ☆19 Updated 2 years ago
- AWS Quick Start Team ☆18 Updated 5 months ago
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR ☆31 Updated last month
- Dask integration for Snowflake ☆30 Updated 4 months ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated last year
- Example setup for dbt Cloud using GitHub integrations ☆11 Updated 5 years ago
- This repository is no longer maintained. ☆15 Updated 3 years ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆19 Updated 4 years ago
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 Updated last year
- ☆19 Updated 3 months ago
- a pytest plugin for dbt adapter test suites ☆19 Updated last year
- An experimental Athena extension for DuckDB 🐤 ☆54 Updated 2 months ago
- lakeview is a visibility tool for S3 based data lakes ☆29 Updated last year
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆23 Updated last year
- ☆47 Updated last week
- pysh-db - The Data Science Toolkit (DSK) ☆13 Updated 6 years ago
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 Updated last year
- CLI tool to launch Spark jobs on AWS EMR ☆67 Updated last year
- Continuously synchronize directories from remote object store to local filesystem ☆104 Updated last month
- Automatically loads new partitions in AWS Athena ☆18 Updated 4 years ago